http://vger.kernel.org/~davem/cgi-bin/blog.cgi/2008/07/09#netdev_tx_peeling

Hacking and Other Thoughts

Wed, 09 Jul 2008

Grinding away on TX multiqueue
The first batch of TX multiqueue patches went into net-next-2.6 today. Nothing really interesting, just infrastructure pieces. Part of the long road to the destination. So then I turned to the task of hardening up the remaining patches, and the sequencing isn't all that easy.
I want to keep the existing multiqueue code working until I "flip the switch" into the new stuff. But part of what the existing code does is manage the queues in a two-stage manner: there is a global queue that controls the flow control of the entire device, and underneath there are per-queue controls. Both have to be enabled in order to send packets into a particular queue. In the new scheme there is just one control per queue, for better parallelization and scaling.

The new per-queue data structure holds the flag bits that manage the per-queue flow control state. First there is a patch that transitions over to allocating multiple instances of the new data structure, one for each TX queue, instead of having just one. But I left the existing multiqueue flow control routines using the egress_subqueue[] state bit array. At that point the normal netif_stop_queue() et al. functions operate on TX queue zero in the new data structures. This keeps everything working, and for non-multiqueue-aware drivers (every driver we have, minus about 4 or 5) that is exactly what we want those interfaces to do.

But I want to migrate the multiqueue interfaces to use the per-TX-queue state bits too. That doesn't work until both the generic networking code and the multiqueue drivers stop using the compat netif_*_queue() routines; they have to be changed to operate only on the per-queue flow control state bits. The current plan is to hit the drivers and the core code at the same time, all in one changeset. It's the only way I can come up with that doesn't break things at some intermediate stage.

Another fly in the ointment is the wireless QoS code. It uses the existing multiqueue infrastructure, and it's therefore something else I want to prevent from breaking mid-stream.
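To make the two interface generations concrete, here is a minimal userspace sketch of the idea. The structure and function names are illustrative stand-ins, not the actual kernel code: the compat netif_stop_queue()-style entry point only touches queue zero's state bits, while the multiqueue-aware entry point takes a per-queue structure directly.

```c
#include <assert.h>

/* Hypothetical, simplified sketch -- not the real kernel structures. */

enum { QUEUE_STATE_XOFF = 1u << 0 };  /* per-queue "stopped" flag bit */

struct tx_queue_sketch {
	unsigned long state;              /* per-queue flow control bits */
};

struct net_device_sketch {
	struct tx_queue_sketch txq[4];    /* one control per TX queue */
	int num_queues;
};

/* Compat-style interface: operates on TX queue zero only, which is
 * exactly what a non-multiqueue-aware driver expects. */
static void netif_stop_queue_sketch(struct net_device_sketch *dev)
{
	dev->txq[0].state |= QUEUE_STATE_XOFF;
}

/* Multiqueue-aware interface: per-queue control, no global state. */
static void netif_tx_stop_queue_sketch(struct tx_queue_sketch *q)
{
	q->state |= QUEUE_STATE_XOFF;
}

static int netif_tx_queue_stopped_sketch(const struct tx_queue_sketch *q)
{
	return !!(q->state & QUEUE_STATE_XOFF);
}
```

The point of the sketch is the migration hazard: as long as drivers call the compat routine, only queue zero's bits are authoritative, so the core and the drivers have to switch to the per-queue routines in the same changeset.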
Beaux-Arts and kernel hacking...
My recent hobbies have included an intense study of New York City architecture, and in particular the fascinating stories behind the city's two most prominent train stations: Grand Central Terminal and the arguably infamous Pennsylvania Station.

In the second half of the 19th century and on into the first half of the 20th, any American architect worth his salt studied at the Ecole des Beaux-Arts in Paris. If you had a degree from that school, you were at the top of the pile for selection on all of the interesting commissions of the time.

The school presented the student with a challenging and fast-paced curriculum. For the American students, the first challenge was just getting in: the entrance exam (of course) required at least some proficiency in French, and several of the most notable American architects had to retake this entrance exam five or more times before being able to pass. Once accepted, the student was pressed to solve problems. Twelve hours were given to draft up a solution to a real architectural problem. Then, once the draft was accepted, the student had two weeks to flesh out all of the details and present the final design. All the while the student's progress was critiqued by an established French architect who oversaw a group of students.

We really don't have that kind of training for computer science people. It's not even science, I would say. This kind of training does exist for pure mathematics, especially in France. Envision a school where you're asked to draft up the design of a compiler pass in 12 hours, then for two weeks you implement it, and meanwhile Alfred Aho critiques your work. This kind of place simply doesn't exist.
(Yes, I know Alfred teaches at Columbia currently, so maybe this specific place does exist :-) but I maintain that, more generally, such institutions do not.)

Open source development and "throwing the masses of monkeys at the problem" seems to be a logical consequence of this, does it not? A formally trained Beaux-Arts architect and a room with a few drafters could design something as insanely complicated as a huge transportation hub in the middle of New York City. And it would work, as there was no room for failure. To me it seems that someone similarly trained could do a complete operating system, compiler, or similar large software engineering task. McKim, Mead, and White were three architects and a couple of drafters, and yet they were able to complete works such as the Boston Public Library, Pennsylvania Station, and the James Farley Post Office, to name just a few.

So which is better, strict formal training and mentorship, or open source monkeys? You decide!

Thu, 05 Mar 2009

A Sparc JIT for ioquake3
I recently got DRM working on sparc64, and this of course meant I had to test it :-) I played around with ioquake3 and it worked just fine. This was in a way exciting, since I have poured many hours of my life into this game on x86.

One aspect of quake3 is that it has a virtual machine. You can write a MOD for quake3 and replace pretty much any aspect of the game outside of the rendering engine. But to keep MOD authors from eating people's home directories, sending your password out to some rogue collection system, and things of that nature, the interfaces are tightly controlled and the MOD code runs in a JIT'd VM. The only way you can get into the JIT is to make a "system call", and the only way to get out is to either return or make such a system call into another module. The system call is the main entry point into the module; it takes an integer command and 0 or more integer arguments. All memory accesses done by the JIT'd code are masked so that it is impossible for the JIT to touch memory outside of that allocated explicitly for it by the VM.

Ben H. kiddingly said to me that I should write the Sparc JIT since there is one for x86 and PowerPC already. One should never kid about such things...

It's pretty neat stuff, although the stack machine VM code output by the LCC compiler they used is horribly inefficient. Some code for a function might look like:

    OPCODE[OP_ENTER] IMM4[0x0000001c]   ! ENTER function, 0x1c of stack
    OPCODE[OP_LOCAL] IMM4[0x00000024]   ! PUSH stack offset 0x24 (first arg)
    OPCODE[OP_LOAD4]                    ! LOAD from "stack + 0x24"
    OPCODE[OP_LEAVE] IMM4[0x0000001c]   ! LEAVE function, return LOAD result

Operations push entries onto the "register stack", and consume entries on the top of that stack. This might emit some Sparc code like:

    save    %sp, -64, %sp       ! OP_ENTER
    sub     %g3, 0x1c, %g3
    add     %g3, 0x24, %l0      ! OP_LOCAL
    and     %l0, %g5, %l0       ! OP_LOAD4
    ld      [%g4 + %l0], %l0
    add     %g3, 0x1c, %g3      ! OP_LEAVE
    ret
     restore %l0, %g0, %o0

We use several fixed registers: "%g3" is the stack pointer, "%g5" is the VM data segment offset mask, and "%g4" is the data segment base address. So every load or store address formation is "mask with %g5 and add to %g4".

It's there in the ioquake3 repo right now and will be in the next release. There are lots of things that can be improved, but it works very well and most of the quake3 MODs I've tried (CPMA, UrbanTerror, etc.) work. I've also been playing the base game online extensively, you know, for stress testing.
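The "mask with %g5 and add to %g4" address formation can be sketched in plain C. The names here (vm_data, vm_data_mask) are illustrative, not ioquake3's actual identifiers; the point is that every address the JIT'd code computes is ANDed with a mask before being offset by the data segment base, so any out-of-range address wraps back into the VM's own memory instead of escaping it.

```c
#include <string.h>

/* Sketch of the VM's sandboxed 4-byte load, assuming hypothetical
 * names. Mirrors the emitted pair:
 *     and %l0, %g5, %l0     ! mask the offset
 *     ld  [%g4 + %l0], %l0  ! load relative to the segment base
 */
static int vm_load4(const unsigned char *vm_data,   /* %g4: segment base */
                    unsigned int vm_data_mask,      /* %g5: offset mask  */
                    unsigned int addr)
{
	int val;
	memcpy(&val, vm_data + (addr & vm_data_mask), sizeof(val));
	return val;
}
```

For this scheme the data segment size is a power of two, so the mask is simply size - 1 and no branch or bounds check is needed on the hot path.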