[Rpm-maint] [rpm-software-management/rpm] RFE: eliminate all remnants of rpm network access in man pages and code (#521)
Again, this RFE is a result of assessing how to send a patch to simplify lib/rpminstall.c.

It is very clear that years of refactoring have removed almost all ability for rpm to access the network (the one exception I can find is in lib/rpminstall.c). Meanwhile, the rpm.8 man page still documents --ftpport proxy overrides, and there are utterly useless macros for hkp:// pubkey retrieval, with defaults that cannot possibly have worked for years.

It's your trash now, not mine. I'd suggest biting the bullet and hauling out the gobs and gobs of uselessness in rpm code that likely has not worked for almost a decade.

--
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/issues/521
_______________________________________________
Rpm-maint mailing list
Rpm-maint@lists.rpm.org
http://lists.rpm.org/mailman/listinfo/rpm-maint
[Rpm-maint] [rpm-software-management/rpm] RFE: eliminate Berkeley DB by switching to either LMDB or ... (#520)
This RFE is an attempt to document two important remaining known issues that need to be solved to use LMDB in production. The issues are:

* file path indices may exceed limits imposed on the size of keys by LMDB

  The solution was already suggested by Howard Chu in Fedora bugzilla: use a hash of the directory name instead of the actual directory name. The hash computation of the directory name can be handled in the LMDB backend without more pervasive changes (though there are space, I/O, and consistency benefits to using a directory name hash everywhere).

* header instances used as keys into Packages need to become big endian for btree functionality

  A balanced tree expects big endian integer keys so that keys will be equally distributed. Using little endian keys will cause key pile-ups that void algorithmic guarantees.

If there are other known LMDB implementation issues, please add them. There are of course further issues in converting user databases everywhere, but those issues cannot be blamed on the backend chosen to replace Berkeley DB.

--
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/issues/520
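The two fixes described in #520 can be illustrated with a minimal Python sketch. It is not rpm or LMDB code: the function names are hypothetical, SHA-256 is an assumed choice (the RFE does not name a hash), and the ordering demonstration assumes keys are compared byte by byte, which is believed to be LMDB's default when no integer-key flag is set.

```python
import hashlib
import struct


def dirname_key(dirname: str) -> bytes:
    """Map a directory name of any length to a fixed 32-byte key.

    SHA-256 is an assumption made for illustration only.
    """
    return hashlib.sha256(dirname.encode("utf-8")).digest()


def hdrnum_key_be(instance: int) -> bytes:
    """Pack a header instance as a 4-byte big-endian unsigned integer."""
    return struct.pack(">I", instance)


def hdrnum_key_le(instance: int) -> bytes:
    """Pack a header instance as a 4-byte little-endian unsigned integer."""
    return struct.pack("<I", instance)


instances = [1, 2, 255, 256, 257, 65536]

# Sort the packed keys byte-wise, then decode them back to integers.
be_order = [struct.unpack(">I", k)[0]
            for k in sorted(hdrnum_key_be(i) for i in instances)]
le_order = [struct.unpack("<I", k)[0]
            for k in sorted(hdrnum_key_le(i) for i in instances)]

# A very long path still yields a short, fixed-size key.
long_dir_key = dirname_key("/usr/share/" + "x" * 1000)
```

Under byte-wise comparison the big-endian keys come back in numeric order, while the little-endian keys are ordered by their low byte first, so numerically adjacent instances end up scattered across the tree.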
Re: [Rpm-maint] [rpm-software-management/rpm] RFE: run rpm scripts on a different thread using MQTT pub/sub message queues (#519)
To forestall the obvious objections to using MQTT, please note that I also have working code for SysV messages, POSIX message queues, AMQP, ZeroMQ, etc. I am also quite capable of implementing asynchronous RPC similar to the above in XMLRPC, Jabber, UNIX domain sockets, or the protocol du jour.

The type of transport used is not the issue. The example outlines a solution to a known problem in order to ask an important question.

--
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/issues/519#issuecomment-413930567
[Rpm-maint] [rpm-software-management/rpm] RFE: run rpm scripts on a different thread using MQTT pub/sub message queues (#519)
This RFE tries to provide a concrete example addressing the known bottlenecks with rpm install/update mentioned in issue #517.

MQTT is Yet Another Message Queue (YAMQ), like ZeroMQ, AMQP, or M$ MQ; it was developed by IBM, with a client implementation maintained in the Apache Foundation.

Message queues are typically (ZeroMQ is an exception) implemented with a broker to ensure reliable delivery of messages in sequential order, using publish/subscribe methods for producers/consumers. With MQTT, the subscribing consumer is delivered a message through a callback on a different thread.

One approach to running scripts on a different thread would involve creating a publish/subscribe mailbox within RPM and sending the script to be executed through MQTT to be run asynchronously. The subscribe code would take the message body (i.e. the rpm script to run) and either invoke /bin/sh or run embedded lua on a different thread. The return code would then be sent back to the original publisher to be collected.

Using MQTT in this fashion is just an obvious implementation of asynchronous RPC. The benefit comes from the simplicity with which the scriptlet runs on a different thread: rpm execution can proceed without blocking on waitpid, implementing thread pools, using locks, or worrying much about thread safety of rpmlib, since MQTT messages hide all the gory details.

I have working code for MQTT, the refactoring to achieve asynchronous execution is obvious, and I can provide an implementation, with measurements, if there is interest. Yadda yadda.

The real point of this RFE is to supply a concrete illustration of an alternative path to parallelism for issue #517 and to re-ask the question:

What implementations are deemed acceptable to RPM?

--
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/issues/519
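The publish/subscribe flow described in #519 can be sketched in Python as follows. This is not MQTT and not the author's code: two in-process `queue.Queue` objects stand in for the broker's request and reply topics, and the names `run_scriptlets_async` and `subscriber` are hypothetical. It only shows the shape of the asynchronous RPC: publish a script, let a subscriber on another thread run it with /bin/sh, and collect the exit code from a reply message.

```python
import queue
import subprocess
import threading


def run_scriptlets_async(scripts):
    """Run each script on a subscriber thread and return {tag: exit code}."""
    requests = queue.Queue()  # stands in for the "scripts to run" topic
    replies = queue.Queue()   # stands in for the "exit codes" topic

    def subscriber():
        # Consume messages until the None sentinel arrives.
        while True:
            msg = requests.get()
            if msg is None:
                break
            tag, script = msg
            rc = subprocess.run(["/bin/sh", "-c", script]).returncode
            replies.put((tag, rc))

    worker = threading.Thread(target=subscriber)
    worker.start()

    # Publishing does not block on script completion.
    for tag, script in scripts.items():
        requests.put((tag, script))

    # ... the publisher could do other work here ...

    requests.put(None)  # tell the subscriber no more scripts are coming
    worker.join()

    results = {}
    while not replies.empty():
        tag, rc = replies.get()
        results[tag] = rc
    return results
```

For example, `run_scriptlets_async({"pre": "exit 0", "post": "exit 3"})` would be expected to return the exit status of each script keyed by its tag. A real broker-based transport would add delivery guarantees and cross-process operation that this in-process stand-in does not attempt.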
Re: [Rpm-maint] [rpm-software-management/rpm] RFC: What approach to improving performance through threads or non-blocking I/O is acceptable in RPM? (#517)
I was referring to using fsync wrapped in an event loop (and on rotating media), and was referring to rpm, not system, performance. Avoiding blocking system calls by using non-blocking alternatives in an event loop is what Node.js does.

Rpmbuild and rpm both do loops over package operations, and some of the operations are costly. I'm not sure why you consider that comparison "weird", although certainly rpmbuild and rpm are performing very different tasks.

Running scripts within an event loop to avoid rpm blocking on waitpid, or using a thread pool so that scripts will run in parallel on multiple CPUs, would seem to be one approach to solving the bottleneck you mention, but I have no numbers either.

Post-transaction file triggers involve loops that might benefit from parallelism. Comparing existing script execution to a proposed file trigger alternative is rather irrelevant to the topic asked here.

Thread safety can always be achieved with a "big lock" that permits only a single thread to use rpmlib at any point in time. The added complexity of a "big lock" implementation is minimal.

The point of this RFC was to ask whether, say, OpenMP or POSIX threads implementations were preferred if/when threading is attempted to solve some perceived bottleneck. I suspect that we might agree that having multiple, organically grown thread paradigms adds a large (and mostly unnecessary) complexity.

Since you are not aware of any existing bottlenecks with rpm install/upgrade, I'm not sure further discussion is useful.

--
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/issues/517#issuecomment-413899939
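The "big lock" idea mentioned above can be sketched in Python as follows. `FakeLib` is a hypothetical stand-in for a library with unsynchronized shared state; it is not rpmlib, and the other names are invented for illustration. Every call into the library goes through one global lock, so only one thread is inside the library at a time, while a thread pool submits the work.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# One process-wide lock guarding every call into the library.
_big_lock = threading.Lock()


class FakeLib:
    """Hypothetical library whose state is not protected internally."""

    def __init__(self):
        self.installed = []

    def install(self, name):
        self.installed.append(name)
        return len(self.installed)


def locked_install(lib, name):
    """Enter the library only while holding the big lock."""
    with _big_lock:
        return lib.install(name)


def install_all(names, workers=4):
    """Install every name from a thread pool; return (installed, counts)."""
    lib = FakeLib()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        counts = list(pool.map(lambda n: locked_install(lib, n), names))
    return lib.installed, counts
```

The trade-off is that a big lock serializes all library use, so only work done outside the lock (for example, waiting on a child process) can overlap across threads.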
Re: [Rpm-maint] [rpm-software-management/rpm] RFC: What approach to improving performance through threads or non-blocking I/O is acceptable in RPM? (#517)
Mixing rpmbuild and rpm installation in the introduction is kinda weird, as the two are very different things. The added fsync actually lowers the performance of rpm to leave more air for the rest of the system. So it is not a "performance tweak" either.

Focusing on rpm installation/update here: so far I am not aware of any obvious bottleneck in installation that made it onto my "this needs fixing" list. It is known that scriptlets have been using a significant amount of installation time. This may have improved with the switch to (posttrans) file triggers, but I don't have any numbers on that.

It also is not obvious to me that parallel execution will yield a performance gain that justifies the added complexity. Nevertheless, thread safety should be improved to allow librpm to be used in threaded environments.

I am open to discussing specific parts of the code and whether they are bottlenecks. But that would require some numbers.

--
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/issues/517#issuecomment-413847900