Hi,

On Thu, 2006-12-28 at 15:24 +1300, Ravi Chemudugunta wrote:
> I have recently started using the open embedded repository to build a
> distribution for a mini-cluster that I am building at the moment.
> While there is some documentation about how to use bitbake, I
> personally found it hard to find documentation on how bitbake worked
> from the base up, while there are some good discussions on here on the
> inner workings of bb, in-code documentation is pretty minimal.
>
> I have started reading through the code to try and figure out how bb
> works and what the different components are. If anyone knows any
> resources that could help me, please do let me know. I am looking at
> maybe documenting the code 1.6.x and would like it if it was helpful
> to everyone; I have used doxygen in the past but would very much like
> to hear what you would like to use.

The user manual for bitbake is at http://bitbake.berlios.de/manual/ and
is generated from the docs in bitbake svn. There is no corresponding
developer manual, just the comments we've slowly been adding to the
source code over time. If you are going to document code, please do so
on trunk, from where it can then filter into the stable releases. Note
that trunk works rather differently from 1.6 in a few places due to the
multithreading (the addition of the runQueue and taskData).

> Regarding the sqlite db backend, I had a query. Were the tests
> conducted using in-memory db or a filesystem db ? From past experience
> with real-time applications I found sqlite performance to be
> especially _very_ slow on writes using db's stored on the filesystem,
> sqlite finalizes every single operation and has no buffering (it may
> be different now). However as someone on the list mentioned python
> dictionaries are probably almost always faster than doing it on a sql
> back-end. However sql might bring to the table, a stricter structure
> and the ability to tie together sophisticated queries - to ease
> tedious operations like joins.
>

bitbake has a very simple data model which just uses set and get
operations. With the current model, there is no advantage to having
search and join operations. Having tried SQL for the taskData
implementation, I don't think it is ever going to be the answer to any
of our problems, as it is orders of magnitude slower (disk or memory
based).

> I was wondering whether anyone has considered using a cluster to speed
> up builds ? A very simple approach would be to have the base directory
> mounted over nfs over all nodes, and then dispatch jobs by running rsh
> using a scheduling algorithm. I have seen that there is a parallel
> build option within bb but I haven't played with that option. As I
> understand bb can run jobs on <n> number of threads and provided that
> the machine you are running it on has more than one core a speedup can
> be achieved.

It's been considered. We'd use a common NFS-mounted directory for parts
of TMPDIR (staging, cross, stamps), but $WORKDIR could be local for
speed as long as all tasks for a given target went to one machine.
There would be a bitbake daemon running on each machine which would
take the tasks to build.

> I am wondering whether multi-threading is going to be used in
> bitbake-ng. Because most machines have two cores and to the most
> four, the approach of multi-threading (shared memory programming) is
> not very scalable when it comes to clusters. While there are
> abstractions available that will make a cluster of computers appear as
> one enabling multithreading to run transparently, it is hard to
> squeeze power out using smp programming. Rather I would use
> message-passing systems using standard tcp/ip style ipc, and because
> of the class of application (long process cycle, short messaging in
> between) using message passing wouldn't impose very much overhead on
> purely smp machines.

The multithreading runs a task per thread. Nothing says this thread has
to run on the local machine :). The multithreading code should allow us
to submit tasks to remote machines as mentioned above; that bit just
hasn't been written yet.

Also, note that even two task threads on a single processor are
beneficial for things like do_fetch, which is bound by network IO, not
by the processor. Other tasks can be disk IO heavy rather than
processor heavy. An aim is to have the single-core bitbake eventually
run, say, 3 fetchers with two limited to do_fetch tasks only.

Cheers,

Richard
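
For illustration only, here is a minimal sketch of the last idea above:
several task threads, some of which accept only do_fetch tasks. This is
not bitbake code. The function run_tasks, the worker names and the
(task_type, callable) task format are all made up for the example, and
reading "fetchers" as worker threads is an assumption. Unlike the real
runQueue, it ignores task dependencies and queues every task up front.

```python
import threading


def run_tasks(tasks, workers):
    """Run tasks on threads; each worker only takes task types it allows.

    tasks:   list of (task_type, callable) pairs, e.g. ("do_fetch", fn)
    workers: list of (worker_name, allowed) pairs, where allowed is a set
             of task types, or None meaning "any task"
    Returns a list of (worker_name, task_type, result) in completion order.
    """
    pending = list(tasks)
    lock = threading.Lock()
    log = []

    def take(allowed):
        # Remove and return the first pending task this worker may run.
        with lock:
            for i, (task_type, fn) in enumerate(pending):
                if allowed is None or task_type in allowed:
                    return pending.pop(i)
        return None

    def worker(name, allowed):
        while True:
            item = take(allowed)
            if item is None:
                return  # nothing left that this worker is allowed to run
            task_type, fn = item
            result = fn()
            with lock:
                log.append((name, task_type, result))

    threads = [threading.Thread(target=worker, args=w) for w in workers]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


if __name__ == "__main__":
    # Hypothetical task list; the lambdas stand in for real work.
    demo_tasks = [
        ("do_fetch", lambda: "fetched a"),
        ("do_compile", lambda: "compiled a"),
        ("do_fetch", lambda: "fetched b"),
    ]
    # One general thread plus two threads limited to do_fetch.
    demo_workers = [
        ("general", None),
        ("fetch-1", {"do_fetch"}),
        ("fetch-2", {"do_fetch"}),
    ]
    for entry in run_tasks(demo_tasks, demo_workers):
        print(entry)
```

The restricted threads cost little on a single core because they spend
most of their time waiting on the network, which is the point made
above. Which thread ends up running which do_fetch task is not
deterministic.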
