Re: Archiving information, was Re: ADM-3A question
On 16/08/2019 16:33, Noel Chiappa via cctalk wrote: There is an automatic backup system which sends copies to a machine at his house, so the particular scenario above (hosting service goes away with no warning) is not an issue. (Yes, a Chicxulub event in Scandinavia would defeat that, but we'd all probably have larger problems to worry about!) After the first event, I make manual backups here of all the articles I contribute. The biggest concern is if he has an unfortunate interaction with a truck. I did raise this issue with him, and he had some initial suggestions, but I haven't followed through. If people start contributing, it'd probably be time to formalize something. I think having it mirrored would be a smart move. Bitsavers is mirrored, manx nearly vanished but is now online (although I've just noticed that it's hosted on CodePlex ...). It is possible to Special:Export each page via a script but it would be much easier to have the existing backup mechanism make copies available to multiple people. (It's easy to install MediaWiki, so testing the backup occasionally should be straightforward). Antonio -- Antonio Carlini anto...@acarlini.com
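For anyone who wants to try the Special:Export route mentioned above, here is a minimal sketch of what such a script might look like. It assumes the wiki exposes the standard MediaWiki `Special:Export/<title>` path under the same `/wiki/` prefix as the Main_Page URL quoted elsewhere in this thread; that has not been checked against the actual site. The function names and the list of titles are made up for illustration.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: pages live under this prefix, as the Main_Page URL suggests.
WIKI_BASE = "http://gunkies.org/wiki/"


def export_url(title: str, base: str = WIKI_BASE) -> str:
    """Build the Special:Export URL for one page title."""
    # MediaWiki writes spaces in page titles as underscores.
    name = title.replace(" ", "_")
    return base + "Special:Export/" + quote(name, safe="/:")


def fetch_export(title: str, base: str = WIKI_BASE, timeout: float = 30.0) -> bytes:
    """Download the XML export of one page (this performs a network request)."""
    with urlopen(export_url(title, base), timeout=timeout) as response:
        return response.read()


def save_exports(titles, base: str = WIKI_BASE) -> None:
    """Write one XML file per title into the current directory."""
    for title in titles:
        filename = title.replace(" ", "_").replace("/", "_") + ".xml"
        with open(filename, "wb") as out:
            out.write(fetch_export(title, base))
```

Calling `save_exports(["Main Page"])` would write `Main_Page.xml`. A per-page loop like this only captures the titles you list, which is part of why sharing the existing server-side backup would be the easier option.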
Re: Archiving information, was Re: ADM-3A question
On Fri, 16 Aug 2019, ben wrote: Well with me I have been finding with many searches, the modern browsers refuse to display sites for "what they figure is unsafe" yet the porn ads still show. I can find it, but not view it. I use current versions of Firefox and Chrome/Chromium, and I don't have any problems viewing sites. And for your porn problem, what about using uBlock and NoScript? Christian
Re: Archiving information, was Re: ADM-3A question
> Then imagine that a law is passed in a far away land, and the site owner > decides it's too risky to bother with, and they then take the entire > site down - wiki and fora - with no warning and no access to the material... Gosh, Steven, I can't imagine for the *life of me* what site you're referring to here. I mean, it's not like Some Guy Immediately pulls it all down on a whim, is it? ;) -- personal: http://www.cameronkaiser.com/ -- Cameron Kaiser * Floodgap Systems * www.floodgap.com * ckai...@floodgap.com -- God may be subtle, but He isn't plain mean. -- Albert Einstein -
Re: Archiving information, was Re: ADM-3A question
One of the problems with archiving is what to do with items that are not popular. Some things might be more valued ten or twenty years in the future but not now. Is the fact that the item has relatively low interest now a possible reason to not archive it in a searchable form for future reference? What about things that are scattered on other personal sites currently that may be gone next week? So much information is already lost. Who determines what should be saved? What say you come across a rare document but the copy was poorly done at a lower than desired resolution. Do you refuse to post it because it doesn't meet your standards or do you post it with a note that it is the best to date? Judging such things can be arbitrary and be the reason for lost information. At least when you publish a book, there is a chance that some copy may be saved. Now with information sitting on someone's disk drives, it could be deleted with one mistake. This is a really complicated issue. I'm getting older and know I'm on the tail end of my life. Still, I have no way to begin to pass on what I have. I doubt my heirs would care much unless it had significant monetary value. Dwight From: cctalk on behalf of Seth J. Morabito via cctalk Sent: Friday, August 16, 2019 8:31 AM To: General Discussion: On-Topic and Off-Topic Posts Subject: Re: Archiving information, was Re: ADM-3A question Paul Koning via cctalk writes: > Anything worth having around deserves backup. Which makes me wonder > -- how is Wikipedia backed up? I guess it has a fork, which isn't > quite the same thing. I know Bitsavers is replicated in a number of > places. And one argument in favor of Git is that every workspace is a > full backup of the original, history and all. > > One should worry for smaller scale efforts, though. This is a problem I think about a lot. In the early 2000s I worked on the LOCKSS program at Stanford University. 
LOCKSS stands for "Lots Of Copies Keep Stuff Safe", and is a distributed network of servers that replicate backup copies of electronic academic journals. It stemmed from a research project that looked at how to design an attack-resistant peer-to-peer digital archival network. Each node in the network keeps a copy of the original journal content, does a cryptographic hash of each resource (HTML page, image, PDF, etc.), and participates in a steady stream of polls with all the other nodes where they vote on the hashes. If a minority of nodes loses a poll, their content is assumed to be damaged, missing, or bad, and they replicate the content from the winners of the poll. It's designed as a "Dark" archive, meaning the data is there, but nobody tries to access it unless the original web content disappears. Then, the servers act as transparent web proxies, so when you hit the original URL or URI, they serve up the content that's now missing from the real public Internet. It's a neat idea. It's also open source, and unencumbered with patents. I've always thought a similar model could be used to archive and replicate just about anything, but it's just one of those things that nobody's ever gotten around to doing. >paul -Seth -- Seth Morabito Poulsbo, WA, USA w...@loomcom.com
Re: Archiving information, was Re: ADM-3A question
On Fri, Aug 16, 2019 at 02:21:36AM -0400, Noel Chiappa via cctalk wrote: [...] > Yeah, I added "CHWiki" to the text on the Main Page to make it a > little easier Because of curiosity, I tried. On gog: === chwiki - because gog discovers I type from Poland and "chwiki" looks like Polish word "chwili" (a genetivus of "chwila" which means "moment", or "a second", like "just a second"), so it gave me page full of stuff like "this moment is best" or "no better moment than her touch" (which even for native speaker sounds a bit too contorted, but gog just indexes whatever garbage local folk produce) === computer history wiki - fifth result on first page === gunkies - first link on first page On double duck: all the same, like above Please note, for me gunkies.org and http://gunkies.org/wiki/Main_Page are equals, so I assume finding gunkies.org counts. -- Regards, Tomasz Rola -- ** A C programmer asked whether computer had Buddha's nature. ** ** As the answer, master did "rm -rif" on the programmer's home** ** directory. And then the C programmer became enlightened... ** ** ** ** Tomasz Rola mailto:tomasz_r...@bigfoot.com **
Re: Archiving information, was Re: ADM-3A question
On 8/16/2019 1:50 AM, Christian Corti via cctalk wrote: On Thu, 15 Aug 2019, Noel Chiappa wrote: An additional issue, I think, is that Google is deprecating sites that use HTTP, versus HTTPS. I can't comment more, lest I start ranting at the utter Not true, in contrary, Google even crawls through FTP sites :-) Christian Well with me I have been finding with many searches, the modern browsers refuse to display sites for "what they figure is unsafe" yet the porn ads still show. I can find it, but not view it. Ben.
Re: Archiving information, was Re: ADM-3A question
> From: Steven M Jones > imagine that a law is passed in a far away land, and the site owner > decides it's too risky to bother with, and they then take the entire > site down - wiki and fora - with no warning and no access to the > material... > .. > I would strongly suggest that if people are going to do something of > the scale you describe, they might want to consider setting up a > distribution or replication mechanism Past events have made me very concerned about this issue! On a couple of occasions, Tore (who runs the CHWiki) has forgotten to pay the DNS fee, or something similar, and it went off-line (the first time for a week, as he was off camping). Leading to total panic on my part when he wasn't reachable, about all the content I'd written! There is an automatic backup system which sends copies to a machine at his house, so the particular scenario above (hosting service goes away with no warning) is not an issue. (Yes, a Chicxulub event in Scandinavia would defeat that, but we'd all probably have larger problems to worry about!) After the first event, I make manual backups here of all the articles I contribute. The biggest concern is if he has an unfortunate interaction with a truck. I did raise this issue with him, and he had some initial suggestions, but I haven't followed through. If people start contributing, it'd probably be time to formalize something. Noel
Re: Archiving information, was Re: ADM-3A question
Paul Koning via cctalk writes: > Anything worth having around deserves backup. Which makes me wonder > -- how is Wikipedia backed up? I guess it has a fork, which isn't > quite the same thing. I know Bitsavers is replicated in a number of > places. And one argument in favor of Git is that every workspace is a > full backup of the original, history and all. > > One should worry for smaller scale efforts, though. This is a problem I think about a lot. In the early 2000s I worked on the LOCKSS program at Stanford University. LOCKSS stands for "Lots Of Copies Keep Stuff Safe", and is a distributed network of servers that replicate backup copies of electronic academic journals. It stemmed from a research project that looked at how to design an attack-resistant peer-to-peer digital archival network. Each node in the network keeps a copy of the original journal content, does a cryptographic hash of each resource (HTML page, image, PDF, etc.), and participates in a steady stream of polls with all the other nodes where they vote on the hashes. If a minority of nodes loses a poll, their content is assumed to be damaged, missing, or bad, and they replicate the content from the winners of the poll. It's designed as a "Dark" archive, meaning the data is there, but nobody tries to access it unless the original web content disappears. Then, the servers act as transparent web proxies, so when you hit the original URL or URI, they serve up the content that's now missing from the real public Internet. It's a neat idea. It's also open source, and unencumbered with patents. I've always thought a similar model could be used to archive and replicate just about anything, but it's just one of those things that nobody's ever gotten around to doing. > paul -Seth -- Seth Morabito Poulsbo, WA, USA w...@loomcom.com
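The hash-and-poll idea described above can be illustrated with a toy sketch. This is not the LOCKSS protocol itself, which is built to resist attack and is considerably more involved; it is a plain majority vote over SHA-256 hashes of a single resource. The node names and the functions `content_hash`, `run_poll`, and `repair` are invented for the example.

```python
from __future__ import annotations

import hashlib
from collections import Counter


def content_hash(data: bytes) -> str:
    """Return the SHA-256 hex digest of one archived resource."""
    return hashlib.sha256(data).hexdigest()


def run_poll(copies: dict[str, bytes]) -> tuple[str | None, list[str]]:
    """Hold one simplified poll over a single resource.

    copies maps a node name to that node's copy of the resource.
    Returns (winning_hash, losers). winning_hash is None when no hash
    holds a strict majority, in which case nobody is marked as a loser.
    """
    if not copies:
        return None, []
    votes = {node: content_hash(data) for node, data in copies.items()}
    top_hash, top_count = Counter(votes.values()).most_common(1)[0]
    if top_count * 2 <= len(votes):
        return None, []
    losers = sorted(node for node, digest in votes.items() if digest != top_hash)
    return top_hash, losers


def repair(copies: dict[str, bytes]) -> list[str]:
    """Replace each losing node's copy with a winner's copy; return the repaired nodes."""
    winning_hash, losers = run_poll(copies)
    if winning_hash is None:
        return []
    good = next(data for data in copies.values() if content_hash(data) == winning_hash)
    for node in losers:
        copies[node] = good
    return losers
```

With three nodes where one holds a damaged copy, `repair` returns that node's name and overwrites its copy with the majority version. With an even split there is no strict majority, so nothing is touched; how a real system resolves that case is outside this sketch.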
Re: Archiving information, was Re: ADM-3A question
On 08/16/2019 02:50 AM, Christian Corti via cctalk wrote: On Thu, 15 Aug 2019, Noel Chiappa wrote: An additional issue, I think, is that Google is deprecating sites that use HTTP, versus HTTPS. I can't comment more, lest I start ranting at the utter Not true, in contrary, Google even crawls through FTP sites :-) I kind of wonder what this is all about? I mean, why do you have to encrypt today's weather report, a company's public web page, and such stuff. Just to waste CPU time? Jon
Re: Archiving information, was Re: ADM-3A question
> On Aug 16, 2019, at 6:14 AM, Steven M Jones via cctalk > wrote: > > On 08/15/2019 23:21, Noel Chiappa via cctalk wrote: >> I have on several occasions posted appeals to this list for people to >> contribute content to it, and gotten almost no response (with one notable >> exception), in terms of added content; that was a large part of why I merely >> mentioned it in an offhand way. > > I don't want to discourage anybody from contributing to this or any other > project. However... > > Imagine if you will that many people, over many years, put a lot of work into > pulling information together on a site with forums, and then distilling that > information into a lot of wiki pages. Many discussions in the forums, with > hard-won facts and interesting projects documented there. Things the > manufacturer(s) never admitted you could do! So many wiki pages carefully > explaining things, recording specifications, procedures, configurations, part > numbers, substitutions. An incredibly useful resource and a very active > community. > > Then imagine that a law is passed in a far away land, and the site owner > decides it's too risky to bother with, and they then take the entire site > down - wiki and fora - with no warning and no access to the material... You don't even have to assume government malice. Lots of providers have gone out of business without any warning simply because of not being economically viable. Or even because the operators decided they weren't interested any longer. Anything worth having around deserves backup. Which makes me wonder -- how is Wikipedia backed up? I guess it has a fork, which isn't quite the same thing. I know Bitsavers is replicated in a number of places. And one argument in favor of Git is that every workspace is a full backup of the original, history and all. One should worry for smaller scale efforts, though. paul
Re: Archiving information, was Re: ADM-3A question
On 08/15/2019 23:21, Noel Chiappa via cctalk wrote: I have on several occasions posted appeals to this list for people to contribute content to it, and gotten almost no response (with one notable exception), in terms of added content; that was a large part of why I merely mentioned it in an offhand way. I don't want to discourage anybody from contributing to this or any other project. However... Imagine if you will that many people, over many years, put a lot of work into pulling information together on a site with forums, and then distilling that information into a lot of wiki pages. Many discussions in the forums, with hard-won facts and interesting projects documented there. Things the manufacturer(s) never admitted you could do! So many wiki pages carefully explaining things, recording specifications, procedures, configurations, part numbers, substitutions. An incredibly useful resource and a very active community. Then imagine that a law is passed in a far away land, and the site owner decides it's too risky to bother with, and they then take the entire site down - wiki and fora - with no warning and no access to the material... I'm not arguing against community collaborations at all - I guess I'm mostly just venting my considerable spleen. :( But I would strongly suggest that if people are going to do something of the scale you describe, they might want to consider setting up a distribution or replication mechanism at their earliest convenience. --S.
Re: Archiving information, was Re: ADM-3A question
On Thu, 15 Aug 2019, Noel Chiappa wrote: An additional issue, I think, is that Google is deprecating sites that use HTTP, versus HTTPS. I can't comment more, lest I start ranting at the utter Not true, in contrary, Google even crawls through FTP sites :-) Christian
Re: Archiving information, was Re: ADM-3A question
> From: Eric Christopherson >> Anyway, the whole 'how do we find the info' is a part of why I started >> working on CHWiki, once I discovered it > Psst: it would've been a good idea to share the URL to CHWiki. Well, that passing reference wasn't an attempt to get people to go look at it, hence no URL! :-) I was focused on the abstract discussion about 'how do we make information accessible, if relying on search engines to find blog postings doesn't work'. I have on several occasions posted appeals to this list for people to contribute content to it, and gotten almost no response (with one notable exception), in terms of added content; that was a large part of why I merely mentioned it in an offhand way. > a site I was already familiar with, but not under the name you used for > it. Ah, formally it's the 'Computer History Wiki', except that's a lot of typing, so I've been using 'CHWiki' as a short, easy-to-type, name for it for some time now. > (It was a bit hard to find with Google, which just goes to show...) Yeah, I added "CHWiki" to the text on the Main Page to make it a little easier to find from the short name, after a previous case where I'd used that term here, to some people's confusion. But I see it still doesn't work well; I guess I'll have to add 'CHWiki' links from more pages. Using 'Computer History Wiki' as a search term only works slightly better, though; it's at the bottom of the first page of results for me, below a bunch of Wikipedia links. Noel PS: In response to a point raised in a private reply to me; the site is for _all_ historical computers: personal computers, mainframes, the lot. I myself have added a lot of PDP-11 material, but only because I'm very fond of them, and know them well. The field of historical computers is _way_ too broad for one person to cover in depth, which is part of why I previously appealed to people who knew/were familiar with other corners of it to add detailed content in those areas.
Re: Archiving information, was Re: ADM-3A question
On Thu, Aug 15, 2019, 7:38 PM Noel Chiappa via cctalk wrote: > Anyway, the whole 'how do we find the info' is a part of why I started > working on CHWiki, once I discovered it - in addition to the usual > advantages > of wikis (good for collaboration, good for adding stuff incrementally), it > would put all the info in one place, a 'one stop shopping' for old computer > info. > Psst: it would've been a good idea to share the URL to CHWiki. It's http://gunkies.org/wiki/Main_Page - the address to a site I was already familiar with, but not under the name you used for it. (It was a bit hard to find with Google, which just goes to show...) -- Eric Christopherson
Re: Archiving information, was Re: ADM-3A question
> From: Seth J. Morabito >> having stuff scattered across a zillion personal pages (be they blogs, >> or whatever) is going to make it hard to find the useful one when >> needed > The sheer vastness of content available, combined with a Google > monoculture, combined with a concerted attempt to GAME the Google > monoculture, is making search and discovery hard An additional issue, I think, is that Google is deprecating sites that use HTTP, versus HTTPS. I can't comment more, lest I start ranting at the utter stupidity of forcing everyone to use HTTPS. But if those blogs are using HTTP, that will push them down the results. > I honestly don't know what to do about it. I don't have a better idea, > unless we go back to something like a directory-style curated > experience, a-la Yahoo! circa 1998-ish. I'm not sure that would scale to cover detailed pages on obsolete computers; why is a manual indexer going to cover them? Anyway, the whole 'how do we find the info' is a part of why I started working on CHWiki, once I discovered it - in addition to the usual advantages of wikis (good for collaboration, good for adding stuff incrementally), it would put all the info in one place, a 'one stop shopping' for old computer info. But when I tried to convince people to post stuff there, instead of on their blogs, I got at least one person who was pretty vehement that no way in h*** were they going to stop putting their stuff in their own blog. Noel
RE: Archiving information, was Re: ADM-3A question
OTOH, there are vast quantities of old manuals, schematics, text books, etc. that get thrown out each year because no one will pay for them. I have had the unenjoyable experience of trashing boxes full of stuff because they did not sell. $1-5 is pretty cheap, considering the time to check the condition, photograph, list on website, pack it properly, and get it to the right place. If something has sat here for 23 years and not moved, it is soon going to go away. I filled up John's car last time he came down. I would much rather they went to a good home than the dumpster, but most people do not want the "clutter". -----Original Message----- From: cctalk [mailto:cctalk-boun...@classiccmp.org] On Behalf Of ben via cctalk Sent: Thursday, August 15, 2019 6:57 PM To: cctalk@classiccmp.org Subject: Re: Archiving information, was Re: ADM-3A question On 8/15/2019 4:33 PM, Marvin Johnston via cctalk wrote: > Instead of the search engines working to improve AI, they should be > putting more effort into ESP. > However with 'FREE' web hosting vanishing faster than the Dodo, you have lost most of the Small sites that may have had the information. A blog tends to lose things after the current year. > Marvin My other gripe, is technical books tend to revise for the latest trend in marketing. A fictional book like "Software tools for fools", Version #1 8008, Version #2 Z80 Version #3 386. Version #4 RISC machine #5 latest machine available only for Beta testing. * library has removed books that have not been checked out in the last 3 years. We can borrow the latest copy when it comes in print from the main branch. Ben.
Re: Archiving information, was Re: ADM-3A question
On 8/15/2019 4:33 PM, Marvin Johnston via cctalk wrote: Instead of the search engines working to improve AI, they should be putting more effort into ESP. However with 'FREE' web hosting vanishing faster than the Dodo, you have lost most of the Small sites that may have had the information. A blog tends to lose things after the current year. Marvin My other gripe, is technical books tend to revise for the latest trend in marketing. A fictional book like "Software tools for fools", Version #1 8008, Version #2 Z80 Version #3 386. Version #4 RISC machine #5 latest machine available only for Beta testing. * library has removed books that have not been checked out in the last 3 years. We can borrow the latest copy when it comes in print from the main branch. Ben.
Archiving information, was Re: ADM-3A question
Al Kossow via cctalk writes: > On 8/14/19 8:53 AM, Anders Nelson via cctalk wrote: >> I hope this thread will be written to a blog post > > Buried in a filing cabinet in the basement with a sign that says > "Beware of Leopard". > > Blogs are a stupid way to archive information, almost as stupid as > putting it on Facebook. The problem is not archiving, but rather retrieving the data. As a current example, I am looking for information on the Jonas Escort computers. A slight misspelling (Jonas instead of Jonos) resulted in a whole slew of graphic escort services. And spelling it properly has resulted in basically zero useful information about the computer itself. It is hard to believe the almost total lack of information on the Jonos. If the scarcity is real, it must be worth at least as much as the Apple I :). And ditto for the Molecular Computer although not as bad as the Jonos. BTW, these are two computers I'm looking at bringing to VCFMW if there is any serious interest. Instead of the search engines working to improve AI, they should be putting more effort into ESP. Marvin