Hello,

Starting with your second point: rest assured that a patch that helps you and does not harm or confuse others would be well received, also by Gianfranco. We just need a very simple way to phrase that you want to present a machine as new to BOINC even though BOINC has already seen it.
I have installed BOINC on many machines and have not yet encountered any identity theft myself. On the contrary, an attempt to start BOINC as a regular user introduced a clone of the machine. I can sympathize that something is a bit weird at times, but so far nothing is reproducible on my side. For your doppelganger issue, I suspect that it is less a Debian thing than something erratic in the BOINC client code. What does "hostid" report on the two machines that are taken to be the same? I admit I do not know whether hostid is used by BOINC; it would make sense if it is not, though. Gianfranco may have extra insights.

Best,
Steffen

On 22/01/2017 20:24, trueriver wrote:
> apols if this is a double post, I am not convinced the first copy went out OK.
> ------
>
> hi thanks Steffen and Christian for your kind words.
>
> I believe I am seeing hash collisions in cpid on installing the client in Debian and Mint. I also believe the Mint package is the unchanged Debian one, inherited via Ubuntu.
>
> The symptoms I am seeing are that when a new computer is added to my little farm, it sometimes is taken by the PrimeGrid (PG) server to be an existing host.
>
> This is bad for two reasons, and irritating for a third.
>
> 1. I cannot rely on setting the default location for new computers, because the new machine will come up in whatever location the doppelganger had. This means that it may download and start crunching work that, for example, will run for longer than that host has really got.
>
> 2. If the doppelganger had work in progress, then that work is marked as abandoned. That means that a new task is sent to someone else, wasting the collective time of the project.
>
> (PG have installed two workarounds that ensure that if I go on crunching I do not lose credit. If a task completes and is shown as abandoned at the time of completion, it is sent for validation as if it were not abandoned.
> If a task trickles up then it reverts to being in progress or overdue, and then when it subsequently reports it goes for validation. Provided either of these happens before the WU is deleted from the server, the user gets credit -- neither feature is standard on other projects, or so I understand)
>
> With credit assured, provided I finish the work, that gives me a moral dilemma when the allegedly abandoned work is 10% into a 20 day task. If I abort it I lose credit, but if I continue it I am getting the last 80% of the credit for work I know is now being done by TWO other machines, which is a waste of the project's resources.
>
> 3. (a lesser irritation) When I am testing out different settings (running with and without hyperthreading, say), mixing up historic hosts makes it harder for me to track which host was doing what when.
>
> I have seen this happen among three laptops, running Linux Mint Mate 17.1, Cinnamon 18, and Cinnamon 18.1. Two of these laptops have the same CPU model, i7-6500U, but the third has a model number that looks rather different, m5 6y54. The CPUs are similar in that they are all at the expensive end of the mobile processor range.
>
> When this has happened with these laptops, each time the respective OS was installed from live CD/USB, and boinc installed with synaptic, searching for the boinc meta package.
>
> The first time it happened, March 2016, I was told that I had provoked the problem by using the same USB ethernet dongle and the MAC address was therefore the same. So I went out and bought another couple of dongles, and labelled them for the respective machines. I honestly believe I have not swapped them around inadvertently.
>
> This week (Jan 2017) the same happened again, involving one of the original two laptops and one that had not been involved before.
> Different CPU, different USB dongle, even different kernel versions as I had not yet updated the older machine's kernel at that time. Different manufacturer, so different hardware on motherboard, etc etc.
>
> The oddest feature is that after updating from both laptops a number of times, all of a sudden the server was showing them as separate machines, and had correctly assigned all 8 tasks issued to the new machine to that machine, and correctly assigned all the historic tasks and stats to the old machine.
>
> So I am wondering how it did that. Perhaps it is not the cpid at all, perhaps it is the server software being too clever?
>
> This effect also leaves oddities on the server, like this from my first experience of this issue
>
> http://www.primegrid.com/show_host_detail.php?hostid=512618
>
> as you can see the computer has a different creation and last contact time, so you might think it had contacted the server at least twice. But by the server's own count, it has done so zero times. Maybe you can see how that makes sense (apart from it being a tunnelling effect of your quantum computing module ;)
>
> I am now told on the PG forum that "Linux sometimes fails to pick up the MAC address".
>
> ALSO, I have seen this among my collection of 11 desktop machines, 2 of which are identical apart from MAC address, and 1 is an NFS server, and 8 are diskless, loading their OS from the server using PXE and root=/dev/nfs.
>
> The server runs Linux Mint 18.1. The other desktop machines run a minimal Debian command line OS, netinstall plus ssh plus boinc-client. These are cloned, but the boinc directories are re-initialised each time to contain only the four config files in /etc/boinc-client and softlinks to them from /var/lib/boinc, plus a minimal account_www.primegird.xml that provides my weak auth code. In particular, there is no contamination of the <host_cpid> value as the file that holds that value is not cloned.
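
One way to check whether two machines really carry the same <host_cpid> is to pull the value out of each client's state file and compare the results. The following is only a rough sketch under assumptions: it takes for granted that the value sits in a single <host_cpid>...</host_cpid> element as hexadecimal text, and the function names and host labels are made up for illustration, not part of BOINC.

```python
import re
from collections import defaultdict

# Assumption: the state file holds one <host_cpid>...</host_cpid> element
# whose content is a hexadecimal string.
_CPID_RE = re.compile(r"<host_cpid>\s*([0-9a-fA-F]+)\s*</host_cpid>")


def extract_host_cpid(state_text):
    """Return the host_cpid found in the state file text, or None if absent."""
    match = _CPID_RE.search(state_text)
    return match.group(1).lower() if match else None


def find_cpid_collisions(states):
    """Map each cpid shared by more than one host to the sorted host names.

    `states` maps a (hypothetical) host label to that host's state file text.
    """
    by_cpid = defaultdict(list)
    for host, text in states.items():
        cpid = extract_host_cpid(text)
        if cpid is not None:
            by_cpid[cpid].append(host)
    return {cpid: sorted(hosts) for cpid, hosts in by_cpid.items() if len(hosts) > 1}
```

Feeding it the state file contents collected from each machine would show directly whether the collision is already present on the client side or only appears on the server.
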
>
> Running the diskless machines one at a time works fine, but it does seem random whether it picks up the history of its own hardware, or of one of the other machines. I am not sure about this yet, I am still collecting data.
>
> In any case, my preference would be to start each freshly booted machine as a new machine on the PG server, allowing me to merge them manually (I have been around long enough to remember when that was the norm after a re-install, and personally I preferred that).
>
> If I do not turn a machine on for months on end, when it is powered up I want it to be at the default location, not wherever that physical hardware was located last time it was used. I do understand that this old behaviour changed because of specific requests from users who had their own reasons for wanting hardware continuity.
>
> I am fairly sure there is at least one bug here, possibly a different bug in the two scenarios.
>
> I find it interesting that so far, in ten months, there has not yet been a case of confusion between a laptop and a desktop -- at some point the hardware becomes sufficiently different to avoid ambiguity.
>
> So, FIRST I believe there is a bug that means that cpid is sometimes independent of the MAC address.
>
> SECOND, I am requesting a user-selectable option that allows a user, at install time, to choose to switch hardware-continuity on or off.
>
> I believe the second could be achieved by asking a question in the post-install trigger, and if the user wanted hardware continuity off, the script would create a cpid based on a freshly generated uuid. There could even be a three way option: hardware based, based on hostname, or based on a fresh-every-install cpid (the latter not being a hash of anything on the system but a random based uuid with the punctuation stripped out).
>
> This option would not be offered where the post install trigger found a pre-existing state file with a pre-existing cpid.
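
To make the proposed three-way choice concrete, here is a minimal sketch of the selection logic. It is not the BOINC client's routine or an existing packaging script: the function names and mode strings are hypothetical, MD5 is used for the hostname variant only as an assumed way to get a 32-digit hex value, and how the chosen value would be handed to the client is left open.

```python
import hashlib
import re
import uuid

# Assumption: an existing state file carries the cpid in a <host_cpid> element.
_CPID_RE = re.compile(r"<host_cpid>\s*([0-9a-fA-F]+)\s*</host_cpid>")


def fresh_cpid():
    """Random cpid: a version-4 UUID with the punctuation stripped out."""
    return uuid.uuid4().hex


def hostname_cpid(hostname):
    """Cpid derived from the hostname (MD5 is an assumption for this sketch)."""
    return hashlib.md5(hostname.encode("utf-8")).hexdigest()


def existing_cpid(state_text):
    """Return the cpid already present in the state file text, or None."""
    match = _CPID_RE.search(state_text)
    return match.group(1).lower() if match else None


def choose_cpid(mode, hostname, state_text=""):
    """Return the cpid to install, or None to let the client derive its own.

    A pre-existing cpid always wins, mirroring the rule that the option is
    not offered when a state file with a cpid is already present.
    """
    found = existing_cpid(state_text)
    if found is not None:
        return found
    if mode == "hardware":
        return None  # leave the hardware-based default untouched
    if mode == "hostname":
        return hostname_cpid(hostname)
    if mode == "fresh":
        return fresh_cpid()
    raise ValueError("unknown mode: " + repr(mode))
```

With mode "fresh", every install yields a different value, so a reinstalled machine would show up as new; "hostname" keeps continuity only as long as the hostname stays the same.
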
>
> As a work-around, that same option would solve the issues created if there is, in fact, a bug in the client-generated cpid routine.
>
> Unless you guys can suggest a good reason why not, I intend to make this change to the .deb on my own system, and see what happens. I have spent too much time clearing up the messes that false allegations of "abandonment" make -- by which I mean when tasks on a different set of hardware get marked as abandoned.
>
> If you know of a reason why this idea is unwise, as a home project, please let me know in the next few days.
>
> I am also offering this to you as something you may (or may not) want to roll out more generally.
>
> I also do not know about making a bug report. This effect comes and goes, and I can think I have an effective way to avoid it (as with buying new USB dongles) and then that can fall in a heap. I cannot (yet) produce a definitive recipe to reliably demonstrate this effect, so up to now I have not filed a bug report. Do you think I should?
>
> I would value your thoughts on any of the above.
>
> And if it does turn out to be provoked by me in a way I have not yet thought of, I would be glad to know that, too.
>
> Is there anything else you need to know from me at this time?
>
> River~~
>
> On 22 January 2017 at 15:17, Steffen Möller <[email protected]> wrote:
>
>> Gianfranco is the more active one on the boinc Debian+Ubuntu packages, but, anyway, I do not think the other readers on this list mind you telling us about your concerns right here, in particular since this may also be relevant for packages of other distributions. So, go ahead.
>>
>> Steffen
>>
>> On 22/01/2017 14:45, Christian Beer wrote:
>>> Hi,
>>>
>>> if this is a packaging related problem then it's better to directly contact the package maintainer, but the Debian maintainer is also reading this email list so you may try it out here before opening a Debian bug report.
>>>
>>> Regards
>>> Christian
>>>
>>> On 21.01.2017 18:44, trueriver wrote:
>>>> hi everyone,
>>>>
>>>> before I launch into a description and some questions, may I check this is the right place to ask about problems that seem to occur with running Boinc on multiple Linux machines?
>>>>
>>>> I am wondering, in particular, if the install triggers in the .deb can be improved to avoid a particular issue. I may be offering to assist with that, depending what the issue turns out to be.
>>>>
>>>> So, is this the right place to ask, and if not can you kindly signpost me to the right place please?
>>>>
>>>> regards,
>>>> River~~
>>>> _______________________________________________
>>>> boinc_dev mailing list
>>>> [email protected]
>>>> http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
>>>> To unsubscribe, visit the above URL and
>>>> (near bottom of page) enter your email address.
