[ros-dev] I'll continue being away
Hi all, Just an update on my situation: I have been working full time for a company for two years now, and I'll continue doing so for an undetermined time (read: a long time). I'm still very interested in ROS and I keep working on it in some of the very little spare time I currently have. I do it fully independently, as I don't have the time to discuss what I'm doing, nor do I assume that you are interested in any way in what I do (most probably not, from what we discussed before), but if I ever get something finished to contribute, of course you will know. Best wishes, Jose Catena ___ Ros-dev mailing list Ros-dev@reactos.org http://www.reactos.org/mailman/listinfo/ros-dev
[ros-dev] jcatena's status
Timo told me some of you asked about me. I should have dropped a line here before. In February I signed a contract that will keep me very busy until the end of July. During this period I'll have very little time for ReactOS, but I intend to keep working on the kernel whenever possible. It would be nice if we could earn our living with this. Since I need the money, I have to work on a less interesting but well-paid project. Jose Catena
Re: [ros-dev] Debugging at the next level...
Great work Timo!!! I was also using windbg with pdb with my 32 bit msvc kernel version, it really makes debugging so much more powerful and easy… PDB support was one of the main reasons I wanted to build with msvc. I hope to finish very soon a few modifications to the new trap handler so that it can also be compiled with msvc. Currently the traps are working and the interrupts will soon. Once I have the 32 bit kernel compiled with msvc working again, I’ll check your stuff to see how the 32 and 64 bit builds could share the most. I’m afraid that Sir Richard didn’t understand that his trap handler can’t be compiled with anything but gcc, most probably my fault because of my impolite way of communicating it. But I hope that after I have my branch booting and he can check it out, he will accept my apologies and reconsider adopting the modifications, because otherwise the trunk will not be compatible with anything but gcc. Jose Catena DIGIWAVES S.L.
[ros-dev] msvc build update
Hi, I'm being asked about the msvc build and decided to answer on the list to keep you all informed of the status. Today I submitted a non-working update to my branch: svn://svn.reactos.org/reactos/branches/jcatena-branch which is useful as a reference for those interested in msvc builds. Before I got an svn branch to commit to, I got ntoskrnl built by msvc9 and booting to desktop, based on build 44678. This is not synchable with svn, but I have put it on my server if someone wants to take a look: ftp://diwaves.com/reactos/ntos_44678_local.zip After I got an svn branch, I started to make the same happen without changing so much. Unfortunately this was in the middle of the trap handling rewrite. This is being done in a really weird, unportable, hackish and inefficient way, and I had a lot of problems getting the port to boot. Finally, after much more work than expected, I got the port working but with some weird temporary hacking I didn't like, and then wanted to sync with a more current version, only to find that the trap handlers were substantially changed again, in an even more unportable way. So instead of continuing to debug and hack code that I would pretty much drop at the end, I started rewriting it yesterday and I expect to have it working in less than a week. This will save effort in the end. Initially I wanted to commit only a working branch, but since it's taking time and other devs may benefit from what's done even if it is not complete, I decided to make the working, non-synchable version based on 44678 available on my server, and to commit the non-booting svn branch r45369. I recommend taking a look at platf.h and cpu-c.h; these files contain platform and compiler dependent definitions and macros that make most inline assembly and compiler conditionals elsewhere unnecessary. Hope you find this useful. Jose Catena DIGIWAVES S.L.
[ros-dev] msvc build update
Hi, Sorry, I placed the old non-synchable but working version in a restricted folder. The correct download link is: http://diwaves.com/reactos/ntos_44678_local.zip Jose Catena DIGIWAVES S.L.
Re: [ros-dev] msvc build update
Hi, Don't feel offended, the port to C is a very good thing and indeed better for the most part than the previous asm implementation. I value your work very much. But there are some aspects that are badly implemented:
1) It is not possible to write a good stub in an inline function nor in a regular macro due to C preprocessor limitations. Your implementation trusts the optimizer to remove constant conditionals, but it does not work as well as you may think. Have you seen the generated code? It has a lot of unnecessary branching. I have implemented it as an x-macro, where I can use the preprocessor to select the options to be generated, and it does not generate any run time conditional branching.
2) You have added many inline assembly parts in several files that should be kept as portable as possible. These things are better put together in architecture dependent files/directories. Furthermore, most are unnecessary: my single trap stub in a single x-macro replaces most of your inlines.
3) Since the pointer to the trap frame is passed to handlers, why do you pass as extra parameters information that is already there? Access to members of the KTRAP_FRAME is just as fast as access to local variables or parameters, but saves on additional copies and allows more efficient register usage. Furthermore, using regparm(3) for functions called from inline assembly, when actually only one param was needed, was really a very bad idea. And you pass the same parameters to child handlers in cascade, again and again. This is very inefficient. Today I simplified the syscall and fastsyscall handlers, removing a lot of unnecessary parameter copying and cascading, saving a lot of cycles and memory, while making them much easier to read and understand.
4) Why did you resort to code patching to encode the PKINTERRUPT parameter, instead of generating a static variable named together with the stub, for example? Wasn't it hackish? I hope you understand that you won't save even a cycle that way, so why use such an unportable and complex method?
The worst thing is the indiscriminate and widespread use of inline assembly and compiler specific features. Please think about placing such things in platform specific includes and try to make them as general and reusable as possible, instead of making the code base very difficult to port and maintain. Please take this as constructive criticism and not as any personal offense. We can all learn better if we can accept criticism. Jose Catena DIGIWAVES S.L.
Re: [ros-dev] Arwinss presentation
Technically Arwinss may not be the best possible architecture, but IMHO right now it is the only viable one in order to reach beta in reasonable time. Sure, we will always see how a more native implementation could be more efficient in the end, but the reality is that given the current human resources it is not a realistic approach; it simply won't happen in many years with the current resources. Arwinss allows us to use most of a working win32k subsystem (wine's) with minimal effort, thus saving a huge amount of work. So we can focus on implementing other much needed areas to have a complete os. Why invest huge resources that we don't even have to implement something that is already done, better or worse? After we have the needed partitions, filesystems, complete kernel compatibility, etc, if we have more resources, we may consider continuing the native win32k ss development, with the advantage of having a complete and working system to compare and test against, and most probably with more resources after we deliver a usable system. ReactOS goals are to achieve maximum windows compatibility at both the application and driver/kernel component levels. We don't have any part finished. Arwinss solves with minimal effort the application APIs side, which would otherwise require the largest effort, so we can dedicate our very limited resources to finishing the other parts. It is just my opinion, but I see it so clearly. I hope you all understand this: the best architecture can be the worst one if there is no realistic plan to develop it. Jose Catena DIGIWAVES S.L.
Re: [ros-dev] Request for a project
I second that, an NTFS driver would be very much appreciated. But indeed we need to complete IFS support in the kernel, including CC. I was told someone is working on CC right now. I will also need CC and any IFS related support working to run windows over our kernel, so I'll also be working on it soon. Who is working on CC and how is it going? Jose Catena DIGIWAVES S.L. From: ros-dev-boun...@reactos.org [mailto:ros-dev-boun...@reactos.org] On Behalf Of Javier Agustìn Fernàndez Arroyo Sent: Wednesday, 13 January, 2010 13:10 To: ReactOS Development List Subject: Re: [ros-dev] Request for a project i was thinking in NTFS support :P (yes, i mean to create a free and open source NTFS driver for ReactOS) but it needs a working cc before, i guess and probably its a 5 months work
Re: [ros-dev] [ros-diffs] [sir_richard] 45052: Patch that fixes VMWare boot (and
Also, if anyone has any experience with MSC++ and could port the GCC-centric macros, that would be much appreciated. I'm doing that. Right now I compile and run ntoskrnl in msc (vc9). It mostly runs but I'm still fixing bugs. I have written new small platform and compiler dependent includes and eliminated all unnecessary conditionals everywhere else; this cleans things up a lot. But I asked if my first patch was reported correctly and if I should submit more, received no answer, and it's still not reviewed. I suppose because I am not a respectful enough noob and didn't start kissing asses. So I'm not going to submit any more unless this childish attitude is fixed. But I'll put all my work periodically on my server and announce the availability here so that anyone interested can download it and use whatever they want in ReactOS. Jose Catena DIGIWAVES S.L.
Re: [ros-dev] MSVC
-Original Message- From: ros-dev-boun...@reactos.org [mailto:ros-dev-boun...@reactos.org] On Behalf Of KJK::Hyperion Post them on bugzilla, assign them to me and Cc sginsb...@reactos.org
Well, I submitted my first bug patch to bugzilla. Before I submit more, I'd like to know if I did it correctly. I have many and I'll wait to know if I should submit them this way. I couldn't cc sginsb...@reactos.org, bugzilla doesn't know him. Based on the large number and severity of bugs in the msvc build, perhaps nobody is using it but me; please let me know if you don't want any fixes to the msvc build submitted. The submitted bug patch is as follows:
Bug# 5071 intrin_i.h: sgdt lgdt fixed for msvc
In intrin_i.h there are two inline functions that are defined differently based on __GNUC__ or _MSC_VER. The _MSC_VER ones make ntoskrnl crash early on boot. As these functions were written, sgdt/lgdt stored and loaded the gdt to/from the pointer variable passed instead of the location the pointer points to. Fixed and tested. Patch follows:
intrin_i.h
 Ke386GetGlobalDescriptorTable(OUT PVOID Descriptor)
 {
-    __asm sgdt [Descriptor]
+    __asm {
+        mov ebx, Descriptor
+        sgdt [ebx];
+    }
 }
 Ke386SetGlobalDescriptorTable(IN PVOID Descriptor)
 {
-    __asm lgdt [Descriptor]
+    __asm {
+        mov ebx, Descriptor;
+        lgdt [ebx];
+    }
 }
Jose Catena DIGIWAVES S.L.
Re: [ros-dev] MSVC
As intended in my last post, better work and proof than discussion. Do you agree? You made too many assumptions, forget for a while that you already know everything if you want to learn something. What an audio workstation needs is that the audio events don't delay beyond a predictable maximum. The processing of such events will be always a fraction of the delay. So if the maximum latency is for example 1ms, lower priority rt tasks won't be delayed more than a fraction of a millisecond for this cause. This does not necessarily affect throughput of anything, and anyway never without a good reason. An audio workstation will configure the audio driver to have higher priority DPCs than disk, and disk higher than network, USB, video... A network server will give priority to NIC then to disk then whatever. Is the current scheme better, when all NIC, audio driver, disk will suffer because the mouse DPCs? Proper RT scheduling without anything violating it (like DPCs) can only help in every scenario, how couldn't it? How could the current scheme be more efficient in latency, throughput or anything than this one? Are you joking or what? The rt priorities don't have much to do with throughput, the tasks demanding lowest latency are the ones that will also do its job very quickly. If a rt task does not finish in a very limited time, it does not comply and its priority will be changed to time sharing range, this is one of the keys in my scheme. The useful statistics are very easy to gather efficiently. The threaded DPCs are not a solution when it depends on all drivers employing it (which is not a reality and will not be), and does not address the need of configuration of the priorities scheme by the user. Nor Win7 solves the risks of user configurable real time priorities. MMCSS is not that thing, it is just a try to provide a reserved high priority class to user mode, but it is anyway below *any* and *all* DPCs. 
What happened with WaveRT in Vista is a demonstration of the absolute lack of knowledge of what realtime scheduling is and the effects of latency, and being separate of the scheduler itself, is a proof of the lack of understanding in the very same regard. And very related to the scheduler too. If it was so hard to learn for them after endless discussions, I wouldn't expect you would a more complex timings relationship, much less knowing your I know it all attitude. You make me laugh, so if my scheduler performs better than the one in Win7 (not only better than the ones in ReactOS) you will support its inclusion? LOL. My aim is well beyond Win7 performance anyway. I'll work on it, and if I success as I expect, you will support its inclusion or not as you wish. But please don't ask me to discuss in that tone and attitude, we all know you're the smartest guy with the biggest dick and I won't waste time challenging you, ok? I don't know Arun Kishan, who is he? MS is so huge... Most ppl I treated is in the wdm area. Also as a team of audio developers leaded by Ron Kuper (Cakewalk) we have been working with them at all levels. After a decade or so the improvements are valuable, we got the KS interfaces published, the nasty priority inversion algorithm removed, an so much more. And still, so far from the right thing. If I did alone a scheduler that solved elegantly and efficiently the same problems (and mine is not as large as yours by far), how a company with the resources of MS doesn't? It is not that they don't have great people, it is not that they don't get well founded and documented requests, it is not a black hand... is the very nature of marketing, every user will find some icons nicer than others, but hardly understands what low latency means. If not, look how you saw at it: if MS didn't how you...? If a developer skilled in windows internals doesn't know about real time concepts... Shields down. It is fun but I don't want to waste time in useless discussions. 
I'll keep working as I said, and you all will know how it goes. The only big obstacle in the way is the very limited time I have for it, not any of arguments. Not that changing the DPC handling and sync will be easy, but the model I'm trying to implement works great, I have already used it in very different projects. This is not a fool idea of the latest frickie. Back to msvc build fixing... Cheers, Jose Catena DIGIWAVES S.L. -Original Message- From: ros-dev-boun...@reactos.org [mailto:ros-dev-boun...@reactos.org] On Behalf Of Alex Ionescu Sent: Tuesday, 22 December, 2009 06:36 To: ReactOS Development List Subject: Re: [ros-dev] MSVC Sounds like you want make Threaded DPCs (which exist to fill the niche you talked about) the default model. Are you aware of Threaded DPCs? Why not just go down that route? Your idea of targeting DPC-heavy drivers to lower their throughput is what MMCSS attempted (and still does) to do back in Vista -- it partly lead to a catastrophe as suddenly network card traffic on heavy audio I/O machines dropped
Re: [ros-dev] MSVC
May I kindly suggest that he came here to ask if you are interested in what he wants to do ? Just to know if he has the support of the team. That's what it looks like from a non-programmer's point of view.
1) I was simply asking if I should submit fixes and how.
2) I was telling you what I'm doing for your information. I didn't ask for integration of my project in ReactOS, I'll only ask for it if I finish it successfully.
3) No, I'm not asking for support in that sense, I intend to write the new scheduler alone. I'm aware that it will take a lot of work, but I couldn't expect any involvement by the ReactOS team.
4) I didn't ask for discussion or help. I'm well aware of the difficulties and solutions, well beyond the well intentioned objections raised.
I hope this clears any confusion and ends any discussion on the subject. Jose Catena DIGIWAVES S.L.
[ros-dev] MSVC
I know only RosBE is officially supported. I don't have any problem with that. But I prefer to work in Visual Studio, and perhaps someone who also intends to use it may be interested in the following: per-file custom build steps vs. custom build rules files. The proper way to add support for external tools like the as and nasm assemblers, or any other, is to write a .rules file for each of these tools. This way it is not necessary to specify custom build rules for each file. The rules file for masm is included in vs8, 9 and 10. Once you load a rules file into a project, the associated extensions are handled just like any other tool there, making available inheritable property templates and all that. For example, I wrote an as_mscpp.rules file that reads:
<?xml version="1.0" encoding="utf-8"?>
<VisualStudioToolFile Name="s as (gnu_as mscpp)" Version="8.00">
  <Rules>
    <CustomBuildRule
      Name="s_as_mscpp"
      DisplayName="s (gnu_as mscpp)"
      CommandLine="cl /E [sIncPaths] [sPPDefs] $(InputPath) | as -o [sOutF]"
      Outputs="[$sOutF]"
      FileExtensions="*.s"
      ExecutionDescription="Assembling">
      <Properties>
        <StringProperty
          Name="sOutF"
          DisplayName="Obj File"
          Description="Obj File (-o [file])"
          Switch="&quot;[value]&quot;"
          Inheritable="true"
          DefaultValue="$(IntDir)\$(InputName).obj"
        />
        <StringProperty
          Name="sIncPaths"
          DisplayName="Inc Paths"
          Description="Include search paths (/I [path])"
          Switch="/I &quot;[value]&quot;"
          Delimited="true"
          Inheritable="true"
        />
        <StringProperty
          Name="sPPDefs"
          DisplayName="Preproc Defs"
          Description="Preprocessor Definitions (/D [symbol])"
          Switch="/D &quot;[value]&quot;"
          Delimited="true"
          Inheritable="true"
        />
      </Properties>
    </CustomBuildRule>
  </Rules>
</VisualStudioToolFile>
I have more of these for nasm, wmc, etc. Also, I don't know if I should submit patches for the msvc build. I suppose I should even if it is not the official way, because I find msvc/gnuc conditionals there, but the msvc part is broken in many places. I fixed ntoskrnl and its dependencies' issues (many in crt) for msvc. 
Please let me know if I should submit such patches or not, and what would be the proper way to submit these and any other patches. I suppose that initially I should send them to someone for review, right? I'd prefer to avoid irc or any instant msg system, I hope this channel is adequate for any communication regarding development. I'm very pleased with the improvements in kd and windbg support. I was working on it and found that I wasted some efforts. No problem, I'll check svn more frequently for changes, but I'd like to know if there is a list somewhere with a description of who is currently doing or planning to do what, to avoid doing the same as someone else again. I forgot to say, I finished a large project and I'll be dedicating part of my time to reactos for a while, partial time, perhaps until February. Not much, because I'll need to earn money again soon, but I hope I could at least get ntoskrnl running in a regular xp or s2003 install for testing, fixing any compatibility issues, and then incorporating to it a new high performance scheduler (I'm very sensitive to the very low performance of windows in this area). If in the near future I can keep giving some time to this, I'll take a look at bugs in any area, unimplemented things, or perhaps I could help a bit with the audio area. I'd love to work with this much more, as os development has been always my specialty and preferred area, but I have to earn my living. Jose Catena DIGIWAVES S.L. ___ Ros-dev mailing list Ros-dev@reactos.org http://www.reactos.org/mailman/listinfo/ros-dev
Re: [ros-dev] ReactOS 0.3.11 source broken?
Did you try running clean at the RosBE prompt before attempting to build the code? Yes, of course. I also tried on a new fresh extracted copy. Jose Catena DIGIWAVES S.L. From: ros-dev-boun...@reactos.org [mailto:ros-dev-boun...@reactos.org] On Behalf Of Sir Gallantmon Sent: Monday, 21 December, 2009 13:28 To: ReactOS Development List Subject: Re: [ros-dev] ReactOS 0.3.11 source broken? Did you try running clean at the RosBE prompt before attempting to build the code? On Mon, Dec 21, 2009 at 1:49 AM, Jose Catena j...@diwaves.com wrote: This is very strange, the published source for 0.3.11 causes compile errors here. I compiled without problems all recent versions from svn up to 44678. The published source must not be the one used to compile the release, or something really weird is happening. I thought you wanted to know. Jose Catena DIGIWAVES S.L. ___ Ros-dev mailing list Ros-dev@reactos.org http://www.reactos.org/mailman/listinfo/ros-dev ___ Ros-dev mailing list Ros-dev@reactos.org http://www.reactos.org/mailman/listinfo/ros-dev
Re: [ros-dev] ReactOS 0.3.11 source broken?
Never mind, it was my fault. Now I compiled 0.3.11 sources without stopper errors. I don't know what was the cause of the errors, they disappeared after reinstalling RosBE. It's curious that the RosBE version is the same (1.4.5), and that previous install was compiling svn versions correctly, but I don't think it deserves any further investigation once solved. Jose Catena DIGIWAVES S.L. From: ros-dev-boun...@reactos.org [mailto:ros-dev-boun...@reactos.org] On Behalf Of Gregor Schneider Sent: Monday, 21 December, 2009 14:05 To: ReactOS Development List Subject: Re: [ros-dev] ReactOS 0.3.11 source broken? The source for release versions differes slightly from the SVN source. Still the releases are built from release source, so it should compile with the appropriate RosBE version. What type of error did you get? 2009/12/21 Jose Catena j...@diwaves.com This is very strange, the published source for 0.3.11 causes compile errors here. I compiled without problems all recent versions from svn up to 44678. The published source must not be the one used to compile the release, or something really weird is happening. I thought you wanted to know. Jose Catena DIGIWAVES S.L. ___ Ros-dev mailing list Ros-dev@reactos.org http://www.reactos.org/mailman/listinfo/ros-dev ___ Ros-dev mailing list Ros-dev@reactos.org http://www.reactos.org/mailman/listinfo/ros-dev
Re: [ros-dev] MSVC
I'm very glad to hear you saying that, Alex. Those are indeed requirements my implementation will have to comply with, and I think it is totally feasible. It should improve realtime class latency very meaningfully without making any other parameter worse, although I also expect to achieve improvements in other parameters too. What I intend to do is to fully comply with the real time paradigm, by using some very efficient solutions I have been using to eliminate the risks of rt priority misuse without breaking the paradigm. For me that's the easy thing (and proven already), although some related ppl at MS had a very hard time understanding it. The real challenge is to handle DPCs as preemptable real time threads, fully respecting assigned rt priorities. This is what would fix the problem definitively, but since currently DPCs are not preemptable, there is a possibility of breaking some drivers because of access serialization or sync issues. But I expect this possibility will be very low in the real world, as there should not be interactions between DPCs created by different drivers outside of system control, and ultimately, the new scheduler will be very flexible, allowing each driver's DPC parameters to be configured separately, so DPC preemption could be disabled for each particular driver or for all of them. Probably the default config will be fully compatible with current behavior, but I expect that enabling DPC preemption by lowering the priorities of selected drivers' DPCs should work fine with most or almost all drivers. I could post a more concise description of the plan, but what I ultimately intend to achieve is a fully working and tested ntoskrnl that could run on regular xp or s2003 too (to verify full compatibility). So maybe instead of writing up the details and discussing them, we may wait till I have it working and speak again afterwards. This way I won't be wasting the time of anyone else until the objectives are achieved and can be verified. 
P.D: Win7 has again improved the scheduler in the right direction (and fixed a related big mistake in the Vista's WaveRT model), but the DPC thing still kills the latency and they don't want to change that. I already discussed it with some key ppl at MS: a few understood it, but even those didn't think there were good chances of convincing decision makers anytime soon. It is considered a very low priority, if not null at all. But for all pro audio manufacturers, and some other niches like automation and control, it is a top priority. Best regards, Jose Catena DIGIWAVES S.L. -Original Message- From: ros-dev-boun...@reactos.org [mailto:ros-dev-boun...@reactos.org] On Behalf Of Alex Ionescu Sent: Tuesday, 22 December, 2009 04:10 To: ReactOS Development List Subject: Re: [ros-dev] MSVC If your new implementation: 1) Is better than what Windows does today (hint: it's nearly lockless in Win 7, and O(1) since 2003) in every single way (ie: not sacrificing 50% of desktop users for 10% of server users). AND 2) Maintains full compatibility with Windows applications (and I expect you to TEST this), drivers, etc in every way. I promise you I will wholeheartedly support its inclusion in ReactOS. In fact, I will do even more than just that. ___ Ros-dev mailing list Ros-dev@reactos.org http://www.reactos.org/mailman/listinfo/ros-dev
Re: [ros-dev] [ros-diffs] [sginsberg] 42829: - svchost
// if even
if (!x & 1)
If the code and the comments disagree, then both are probably wrong. ;)
Indeed, I was wrong in the even example. It's:
if (~x & 1) // or if (!(x & 1))
And yes, I just saw it was already implemented, sorry. Jose Catena DIGIWAVES S.L.
Re: [ros-dev] [ros-diffs] [tkreuzer] 42353: asm version of DIB_32BPP_ColorFill: - Add frame pointer - Get rid of algin_draw, 32bpp surfaces must be DWORD aligned - Optimize the loop - Add comments
Below is my C code, based on the C code previously shown here, and the assembly generated by vc. This function, like most, does not benefit much from asm coding, although some cycles can be saved, most notably inside the loop (a cmp and an additional branch in the vc generated code). Some algorithms can benefit a lot from asm, though. For example the Fletcher checksum, or incrementing/decrementing variables larger than the register size, where the use of the carry flag can save many cycles. Also, when a function's execution time is very critical it may deserve asm coding, but I think in this case it is not worth it, as the saving in percentage is tiny (any compiler I know will use rep stosd for the inner loop, which has the largest weight in the total time).
BOOLEAN DIB_32BPP_ColorFill(SURFOBJ* pso, RECTL* prcl, ULONG iColor)
{
    LONG lDelta, cx, cy;
    char * pulLine;
    lDelta = pso->lDelta;
    pulLine = (char *)((char *)pso->pvScan0 + prcl->top * lDelta + (prcl->left << 2));
    cx = prcl->right - prcl->left;
    if (cx <= 0) return TRUE;
    cy = prcl->bottom - prcl->top;
    if (cy <= 0) return TRUE;
    ULONG *p;
    ULONG c;
    for(; cy--; pulLine += lDelta)
    {
        for(p = (ULONG *)pulLine, c = cx; c--; )
        {
            *p++ = iColor;
        }
    }
    return TRUE;
}
PUBLIC ?DIB_32BPP_ColorFill@@YAEPAU_SURFOBJ@@PAU_RECTL@@k...@z ; DIB_32BPP_ColorFill
; Function compile flags: /Ogtpy
_TEXT SEGMENT
?DIB_32BPP_ColorFill@@YAEPAU_SURFOBJ@@PAU_RECTL@@k...@z PROC ; DIB_32BPP_ColorFill
; Line 52
    mov ecx, DWORD PTR ds:4
; Line 54
    mov edx, DWORD PTR ds:8
    push ebp
    mov ebp, DWORD PTR ds:36
    imul ecx, ebp
    xor eax, eax
    mov eax, DWORD PTR [eax]
    push esi
    lea esi, DWORD PTR [ecx+eax*4]
    add esi, DWORD PTR ds:32
    sub edx, eax
; Line 55
    test edx, edx
; Line 56
    jle SHORT $l...@dib_32bpp_
    push ebx
; Line 58
    mov ebx, DWORD PTR ds:12
    sub ebx, DWORD PTR ds:4
; Line 59
    test ebx, ebx
; Line 60
    jle SHORT $l...@dib_32bpp_
    push edi
    npad 4
$l...@dib_32bpp_:
; Line 64
    dec ebx
; Line 66
    test edx, edx
    je SHORT $...@dib_32bpp_
    mov ecx, edx
    xor eax, eax
    mov edi, esi
    rep stosd
$...@dib_32bpp_:
    add esi, ebp
    test ebx, ebx
    jne SHORT $l...@dib_32bpp_
    pop edi
$l...@dib_32bpp_:
    pop ebx
$l...@dib_32bpp_:
    pop esi
; Line 72
    mov al, 1
    pop ebp
; Line 73
    ret 0
?DIB_32BPP_ColorFill@@YAEPAU_SURFOBJ@@PAU_RECTL@@k...@z ENDP ; DIB_32BPP_ColorFill
In asm I would write the loop as:
    mov eax, iColor
    mov ebx, pulLine
    mov edx, cy
L1: mov di, bx
    mov cx, _cx
    rep stosd
    add dx, lDelta
    dec dx
    jnz l1
Jose Catena DIGIWAVES S.L.
Re: [ros-dev] [ros-diffs] [tkreuzer] 42353: asm version of DIB_32BPP_ColorFill: - Add frame pointer - Get rid of algin_draw, 32bpp surfaces must be DWORD aligned - Optimize the loop - Add comments
A correction of my previous msg: In asm I would write the loop as:
    mov eax, iColor
    mov ebx, pulLine
    mov edx, cy
L1: mov edi, ebx
    mov ecx, _cx
    rep stosd
    add ebx, lDelta
    dec edx
    jnz l1
It is not possible to optimize the loop further AFAIK, and this only saves a cmp and jnz in the outer loop, a tiny gain. Jose Catena DIGIWAVES S.L.
Re: [ros-dev] [ros-diffs] [tkreuzer] 42353: asm version of DIB_32BPP_ColorFill: - Add frame pointer - Get rid of algin_draw, 32bpp surfaces must be DWORD aligned - Optimize the loop - Add comments
That looks almost like the version I wrote, except I didn't use memory access inside the loop, but registers, which should saves some cycles. Yes, the same. Register vs memory does not make a difference here: it takes the same time if the data is in the L1 cache (both just 1 cycle), and it will always be there. Jose Catena DIGIWAVES S.L.
Re: [ros-dev] [ros-diffs] [tkreuzer] 42353: asm version of DIB_32BPP_ColorFill: - Add frame pointer - Get rid of algin_draw, 32bpp surfaces must be DWORD aligned - Optimize the loop - Add comments
With all respect, Alex: although I agree with you on the core point, that this does not deserve the disadvantages of asm (portability, readability, etc.) for a tiny performance difference, if any, I don't agree with many of your arguments.

-- 1) The optimizations of the code *around* the function (ie: the callers), which Michael also pointed out, cannot be done in ASM. --

Yes, they can. I could always outperform or match a C compiler at that, and did so many times (I'm the author of an original PC BIOS, performance libraries, mission critical systems, etc.). I very often used registers for calling parameters, local storage through SP instead of BP, good use and reuse of registers, etc. In fact, the loop the compiler generated was identical to the asm source except for the two instructions the compiler added (which serve no purpose; it is an MSVC issue). It is actually in the calling overhead and in local initialization and storage where I could easily beat the compiler, since it complies with rules that I can safely break. Furthermore, in most cases a compiler won't change the calling convention unless the source specifies it, and in any case the register-based calling used by compilers is very restricted compared with what can be done in asm, which can always use more efficient methods (more extensive and intelligent register allocation). In any case, the most important optimizations are done equally in C and assembly when the programmer knows how to write optimum code and does not have to comply with a prototype. For example, passing arguments as a pointer to a struct is always more efficient.

-- 2) The fact if you try this code on a Core 2, Pentium 4, Pentium 1 and Nehalem you will get totally different results with your ASM code, while the compilers will generate the best possible code. --

There are very few, specific cases where the optimum code differs between processors, and this is not one of them. If gcc generates different code for this function on different CPUs, it is not for a good reason. There is only one meaningful exception for this function: the inner loop could use a 64 bit rep stos instead of a 32 bit one. That can be done in asm, while I don't know of any compiler that would use a 64 bit rep stos instruction for a 32 bit target, regardless of the CPU having 64 bit registers.

-- 4) The fact that if the loop is what you're truly worried about, you can optimize it by hand with __builtinia32_rep_movsd (and MSVC has a similar intrinsic), and still keep the rest of the function portable C. --

It is not necessary to use a built-in function like the one you mention, because any optimizing compiler will use rep movsd anyway, with better register allocation if there is any difference. If inline asm is used instead, optimizations for the whole function are disabled, as the compiler does not analyze what is done in inline assembly.

-- Also, gcc does support profiling, another fact you don't seem to know. However, with linker optimizations, you do not need a profiler, the linker will do the static analysis. --

Function-level linking and profile-based optimization are very different things; the linker can in no way perform a similar statistical analysis.

-- Also, to everyone sayings things like I was able to save a operand name here, I hope you understand that smaller != faster. --

Saving these two instructions improves both speed and size. Note that the loop the compiler generated was exactly the same as the original assembly, only with those two instructions added. I can tell whether I am saving speed, size, both, or neither, in either C or assembly.

I wrote this not to be argumentative or confrontational, but just because I don't like to read arguments that are not true, and I hope you all take this as constructive knowledge. BTW, I hardly ever support the use of assembly except in very specific cases, and this is not one. I disagreed with Alex on the arguments, not on the core point.

Jose Catena
DIGIWAVES S.L.
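The one exception mentioned above, filling with 64 bit stores instead of 32 bit ones, can be sketched in portable C. This is only an illustration under stated assumptions, not code from the thread: the name `fill32_wide` is hypothetical, and whether a given compiler turns the `memcpy` into a single 64 bit store (or into a rep stos at all) depends on the compiler and target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fill `count` 32-bit pixels with `color`,
 * writing two pixels per 64-bit store where possible. */
void fill32_wide(uint32_t *dst, size_t count, uint32_t color)
{
    assert(dst != NULL || count == 0);

    /* Both halves of the pair hold the same color, so the result
     * does not depend on byte order. */
    const uint64_t pair = ((uint64_t)color << 32) | color;

    while (count >= 2) {
        /* memcpy avoids alignment and aliasing problems; optimizing
         * compilers typically reduce it to one 8-byte store. */
        memcpy(dst, &pair, sizeof pair);
        dst += 2;
        count -= 2;
    }
    if (count != 0)
        *dst = color;   /* odd pixel left over at the end of the row */
}
```

Calling `fill32_wide(row, cx, iColor)` once per scanline would take the place of the inner loop; the outer loop over `cy` stays as it is.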
Re: [ros-dev] [ros-diffs] [tkreuzer] 42353: asm version of DIB_32BPP_ColorFill: - Add frame pointer - Get rid of algin_draw, 32bpp surfaces must be DWORD aligned - Optimize the loop - Add comments
-- A builtin / intrinsic != inline asm --

I never said that. I wrote "instead". I apologize for not being clear enough.

-- but how would you want to optimize rep stosd anyway? --

There is no way. That's what I said, possibly with the exception of using a 64 bit equivalent if we could assume that the CPU is 64 bit capable. But Alex knows better; he is calling me ignorant. He says that

L1:
    mov [edi], eax
    add edi, 4
    dec ecx
    jnz L1

is faster than rep stosd. Both do exactly the same thing, the latter being much smaller AND FASTER on any CPU from the 386 to the i7. And he shows an irrelevant portion of code that proves nothing regarding what I said; BTW, we don't know what his compiler generated for the loop. In other cases he changes the meaning of what I wrote, corrects something I didn't say at all, or makes baseless assumptions. I'm not going to answer him, LOL! This would be an endless loop. Anyway, I always agreed with him that asm is not helpful in this and most cases. This discussion is a waste of time. I thought from previous posts that he had better knowledge, and perhaps he has, but he certainly does not know much about assembly and CPU architectures, yet he pretends to and doesn't like to be corrected... bad for him.

-- none of the compilers I tested was able to generate a rep stosd from either a loop or memset --

LOL, are we really in 2009? Try the C source I posted; it should be compiled as rep stosd. MSVC and Intel certainly do so regardless of the target CPU, and not just since recent versions. Let me know if yours doesn't; I wouldn't like a compiler that fails to do such a basic and evident optimization. Most often I know pretty well what a compiler will generate without looking at the generated asm; the way the C code is written matters in some cases.

As for memset, MSVC's inline memset will generate rep stosd, plus possibly a stosw and/or stosb if the byte count is not a multiple of the maximum store size or is not constant, which is OK. The library version uses the same, with the call overhead. Anyway, memset is not suitable here: it is for 8 bit values, and wmemset for 16 bit values, while we want to store 32 bit values.

Jose Catena
DIGIWAVES S.L.
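The last point, that memset cannot stand in for a 32 bit fill, is easy to check with a short C sketch. The function names `fill_with_memset` and `fill_with_store32` are hypothetical and used only for this illustration. Standard `memset` converts its value argument to `unsigned char`, so only the low byte of the color survives and is repeated into every byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: what memset does with a 32-bit color.
 * Only the low 8 bits are used, repeated into every byte. */
void fill_with_memset(uint32_t *dst, size_t count, uint32_t color)
{
    assert(dst != NULL);
    /* The mask makes explicit what memset does anyway: the value
     * is converted to unsigned char before being stored. */
    memset(dst, (int)(color & 0xFFu), count * sizeof *dst);
}

/* Hypothetical helper: the plain loop from the thread, storing the
 * full 32-bit value per pixel. */
void fill_with_store32(uint32_t *dst, size_t count, uint32_t color)
{
    assert(dst != NULL);
    while (count--)
        *dst++ = color;
}
```

With the color 0x00FF8040, the memset version leaves every pixel as 0x40404040, while the loop stores 0x00FF8040 as intended.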
Re: [ros-dev] On the growth of the reactos project
I think I am a good example of a potential new developer, one who would be valuable to and interested in the project, so my opinion may be representative of how others like me see ReactOS. It might also be useful because I have been managing large scale projects for 12 years (embedded and distributed OSs, RT, mass transport automation, electric distribution network automation, robotics, scientific satellites, etc.), but never open source projects, so take it accordingly; I don't claim that my opinion is necessarily valuable for ReactOS.

The goal is extremely attractive, and since Wine is working, we can reasonably think that a very meaningful part is done. I certainly feel very attracted. But after studying the code a bit and following the mailing lists for a while, my perception of the organization and progress is poor, which raises a doubt: there is a lot of work to be done, so would my effort be wasted?

I assume that every developer in the team is motivated by the goal, while each one may have specific areas of interest, preference, or experience. Everyone would understand that their own efforts would be wasted if the whole project does not succeed. Everyone needs a minimum of infrastructure and organization to progress efficiently. So everyone should understand the need to address the high priority areas that are slowing everyone down. The highest priorities IMHO are:

1) Documentation. I know most devs, like me, hate having to write it. But complex projects can't progress well without a minimum of well structured documentation. Someone has to define a basic hierarchy and rules. Whenever someone knows or learns anything that is undocumented, they should update the docs, so that others can work coherently. Maybe some of you think there is some already, but to me (please remember my outsider point of view) it seems very poor.

2) Driver installation. The drivers needed to run ReactOS on any hardware or VM are already available. The priority is, at least, to have .inf parsing and the adv/setup API working well. If developers can't start the OS on the platform they need to test, they will hardly be efficient, if they don't give up altogether. Of course drivers can fail after being successfully installed, like so many things, but bugs in .inf installation stop people the first time they try and need to be addressed ASAP.

3) Bug reporting and tracking database. I'm sure you all agree on the necessity of a well designed one. Someone has to do it; a project like this cannot progress far without it.

Again, take this with a grain of salt. I don't know ReactOS as well as most of you; I intended to show first-time perceptions that may or may not attract newcomers, rather than to be accurate in the criticism. Having explained my perception, I can tell you that I sincerely want to contribute as soon as I can; I just don't have enough time currently.

Jose Catena
DIGIWAVES S.L.
Re: [ros-dev] Very nice.
I also think it may not be a good idea to discuss physics here. But since this is going on anyway, I think I may help to find an agreement and end it. Please forgive me and skip the rest of this post if you aren't interested.

I think everyone participating in the discussion knew what the centrifugal force is, and all the confusion can be explained by differing assumptions about what 'real' means, since the natural language adjective is ambiguous as a scientific definition. Perhaps this happened with more than one word. It is not important whether we call a force 'real' or 'abstract', as long as we understand what it actually is and agree on the definition of 'real', which is not obvious, as the discussion proved.

Forces are vectors (forget for now that 'vector' is in itself an abstraction). For analysis, natural forces are often expressed as a sum of vectors. Of course this introduces another abstraction; every time we apply math we are abstracting in some way. Yet 'something that can be demonstrated mathematically' is also a valid meaning of 'real'.

Instead of the centrifugal force, I'll give an example that I think shows the conflict well: the force that the air exerts on a wing moving through it. The 'real' or single force vector is not useful for understanding or analysis, so we always express it as a sum of two vectors: 'lift' and 'drag'. I have never heard anyone say that these are not real, since their effect is demonstrable, while most people would consider the single natural force an abstraction, since it is demonstrated and understood as the sum of the two calculable components, lift and drag. Furthermore, we often consider how a sum of forces is applied to an object, while in the end only the sum matters for the effect produced; in this case we might say the sum is an abstraction, while the components may or may not be 'real' (they may also be abstracted components of a real force)... You see how assuming the meaning of words can be so confusing in science.

As I see it, the problem in the discussion can be solved by delimiting the meaning of 'real' or, much better, by avoiding this ambiguous adjective in favor of a more explicit one. Indeed, the centrifugal force is not directly caused in nature, so if we say 'real' means a direct representation of something that physically exists, it is not 'real'. But we may also say that 'real' means it can be demonstrated mathematically, and then the centrifugal force is certainly 'real', as are 'lift' and 'drag' on a wing. Both are valid meanings of 'real' in different scopes (physical existence vs. scientifically demonstrable).

This unfortunately happens too often. In any kind of discussion, people should try to understand each other's reasoning instead of nit-picking on definitions to discredit each other. That is different from an exhaustive essay, where the author should be careful to clear up any ambiguity. We couldn't speak if we had to be scientifically accurate in every word.

Sorry for the long and off-topic post; I did advise skipping it if it is of no interest... ;-)

Jose Catena
DIGIWAVES S.L.
[ros-dev] Introducing myself
I am sorry for not being able to contribute for now, but I wholeheartedly intend to do so when possible, with your permission. I do well with asm, C and C++, have experience in Windows including drivers, and have developed RT-OSs for embedded systems. I have always wanted to replace the process/thread scheduler and DPC handling with compatible but much improved versions based on my experience, and I hope you will find that useful once I am ready to participate. For now, thanks to all for this effort. The world needs ReactOS!

Jose Catena
DIGIWAVES S.L.