Re: [liberationtech] economic cost of lost emails.

2014-08-25 Thread Andy Isaacson
On Sun, Aug 24, 2014 at 04:40:26PM -0300, J.M. Porup wrote:
 If we really want a permanent archive of humanity's work, we 
 need to build some kind of distributed Noah's Ark. Archive.org is
 no good (book depositories are the first to go when the book-burning
 starts), and asking the book-burners at the NSA and GCHQ to guard
 our civilization's store of knowledge is laughable on its face.
 
 Something P2P, maybe blockchain-based, might work. Convincing people
 of the reality and urgency of the threat is another matter.

A diversity of tactics is best.  Given a useful archive of knowledge
(the complete wikipedia edit history in all languages, including deleted
and censored data, would be a good start), we need to store it in
hundreds of places --

 - on microsd cards in waterproof cases buried underground
 - on archival DVD-R
 - microprinted on metal foil or volume-optimized paper substrates,
   buried in obscure locations
 - on cubesats (but LEO is not a friendly place for multidecade
   storage, you need MEO at least to avoid deorbiting)
 - on long-term electronic storage media including naked-eye readable
   instructions on how to access it (what do you MEAN you can't read an
   Acorn LaserDisc!?)

Folks doing this should be cautious of being completely visible, since
in the hypothesized interregnum the lists of where the knowledge from
the past is will be target lists, both for the oppressors to destroy and
for desperate exploiters to plunder.  A mix of projects --

 - some with explicit locations like these coordinates
 - some with vague lists like 200-300 locations in the continental US
 - some with no presence at all

is best.

Other information to consider including --

 - software implementations (the Debian archive and source code)
 - human language references
 - scientific datasets and paper archives
 - scientific source code and reproducibility instructions
 - farming data and scientific methods
 - practical how-to information such as Farmer's Almanac

-andy
-- 
Liberationtech is public & archives are searchable on Google. Violations of
list guidelines will get you moderated: 
https://mailman.stanford.edu/mailman/listinfo/liberationtech. Unsubscribe, 
change to digest, or change password by emailing moderator at 
compa...@stanford.edu.



Re: [liberationtech] economic cost of lost emails.

2014-08-25 Thread Miles Fidelman



On Sun, Aug 24, 2014 at 04:40:26PM -0300, J.M. Porup wrote:

If we really want a permanent archive of humanity's work, we
need to build some kind of distributed Noah's Ark. Archive.org is
no good (book depositories are the first to go when the book-burning
starts), and asking the book-burners at the NSA and GCHQ to guard
our civilization's store of knowledge is laughable on its face.

Something P2P, maybe blockchain-based, might work. Convincing people
of the reality and urgency of the threat is another matter.




The library community has the right term for this:  LOCKSS (Lots of 
copies keeps stuff safe).


Miles Fidelman

--
In theory, there is no difference between theory and practice.
In practice, there is.    Yogi Berra




Re: [liberationtech] economic cost of lost emails.

2014-08-25 Thread J.M. Porup
On Mon, Aug 25, 2014, at 03:03, Andy Isaacson wrote:
 On Sun, Aug 24, 2014 at 04:40:26PM -0300, J.M. Porup wrote:
  If we really want a permanent archive of humanity's work, we 
  need to build some kind of distributed Noah's Ark. Archive.org is
  no good (book depositories are the first to go when the book-burning
  starts), and asking the book-burners at the NSA and GCHQ to guard
  our civilization's store of knowledge is laughable on its face.
  
  Something P2P, maybe blockchain-based, might work. Convincing people
  of the reality and urgency of the threat is another matter.
 
 A diversity of tactics is best.  Given a useful archive of knowledge
 (the complete wikipedia edit history in all languages, including deleted
 and censored data, would be a good start), we need to store it in
 hundreds of places --
 
  - on microsd cards in waterproof cases buried underground
  - on archival DVD-R
  - microprinted on metal foil or volume-optimized paper substrates,
buried in obscure locations
  - on cubesats (but LEO is not a friendly place for multidecade
storage, you need MEO at least to avoid deorbiting)
  - on long-term electronic storage media including naked-eye readable
    instructions on how to access it (what do you MEAN you can't read an
    Acorn LaserDisc!?)
 
 Folks doing this should be cautious of being completely visible, since
 in the hypothesized interregnum the lists of where the knowledge from
 the past is will be target lists, both for the oppressors to destroy and
 for desperate exploiters to plunder.  A mix of projects --
 
  - some with explicit locations like these coordinates
  - some with vague lists like 200-300 locations in the continental US
  - some with no presence at all
 
 is best.
 
 Other information to consider including --
 
  - software implementations (the Debian archive and source code)
  - human language references
  - scientific datasets and paper archives
  - scientific source code and reproducibility instructions
  - farming data and scientific methods
  - practical how-to information such as Farmer's Almanac
 
 -andy

Anyone know any dissident billionaires willing to fund such a
project? Maybe Pierre Omidyar would be interested...

Jens

--
J.M. Porup
www.JMPorup.com



Re: [liberationtech] economic cost of lost emails.

2014-08-25 Thread Andy Isaacson
On Mon, Aug 25, 2014 at 04:24:02PM -0300, J.M. Porup wrote:
  Folks doing this should be cautious of being completely visible, since
  in the hypothesized interregnum the lists of where the knowledge from
  the past is will be target lists, both for the oppressors to destroy and
  for desperate exploiters to plunder.  A mix of projects --
  
   - some with explicit locations like these coordinates
   - some with vague lists like 200-300 locations in the continental US
   - some with no presence at all
  
  is best.
  
  Other information to consider including --
  
   - software implementations (the Debian archive and source code)
   - human language references
   - scientific datasets and paper archives
   - scientific source code and reproducibility instructions
   - farming data and scientific methods
   - practical how-to information such as Farmer's Almanac
  
  -andy
 
 Anyone know any dissident billionaires willing to fund such a
 project? Maybe Pierre Omidyar would be interested...

We need a Long Knowledge team.  (Maybe Long Now would be interested.)

Renegade Librarians to help us collate, arrange, choose, and index the
knowledge.

Data storage research into ways to store data for the long term, with
bootstrapping help for from-scratch information seekers.

Collections Curators to assemble the desired information on a yearly
basis for the next crop of seeds.

Distributed Storage networking for online collaboration on all the
above.

Independent Planters creating their own instantiations of the seeds
(using disparate funding) to store in locations worldwide and off planet
against future disasters.

and other specialties not yet enumerated ...

-andy



Re: [liberationtech] economic cost of lost emails.

2014-08-25 Thread The Doctor
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA512

On 08/24/2014 12:40 PM, J.M. Porup wrote:

 Something P2P, maybe blockchain-based, might work. Convincing
 people of the reality and urgency of the threat is another matter.

Making local copies is not onerous.  It can be as simple as hitting
^s to save a page, or printing it to a file.  Or it can be as complex
as using Scrapbook to make an instant local copy of a page.  It's a
habit that has to be gotten into, but is well worth it.

- -- 
The Doctor [412/724/301/703] [ZS]
Developer, Project Byzantium: http://project-byzantium.org/

PGP: 0x807B17C1 / 7960 1CDC 85C9 0B63 8D9F  DD89 3BD8 FF2B 807B 17C1
WWW: https://drwho.virtadpt.net/

PRESS PLAY ON TAPE

-BEGIN PGP SIGNATURE-

iQIcBAEBCgAGBQJT+6X8AAoJED1np1pUQ8RkiO8P/08s8NnYjS/U9Hc5HOjUxVgr
ol9Qv1LSpY/WMtzvTTY0EFDs8CfrMEQ0Ak8pva/pPuhiH61Fq13E3fPXHHiTwSOe
XKiaejSotsURZvchIcPoaoUIMUZVcNlwUQekIoHKqTYWI3hP9N63uQ8k1UZlkhav
QfjTdlW6pVdMBBBp+aphWkM1dU4LBIOe4pSWv05UCISbKB6MyHUvZi9zcXFZho1c
fhs7N87VD5ZMC/ypSFD5VA1bZxreqfivtEKt/YjoNzMdcRgAAXxyQt/Hs3NroZPX
dqXtx4OXmPbqXdg1nqj3H/S1V+/6oRrSTXTYIuRBNCrIm6xhqDNeB1feEl0cFbR2
w6lsROrwx/LeLvK2cJQA/q4YNHOr5oWZ5b4IawOsgZSK2KH5gpjVJGxWTE/k+iTs
gl13ud3GDh7cvHS2vrse8Ef2/0A0BwgE2i8jv3EH8jBJI8vojDeErEXUDYlr56mh
nf+3HRnfQdIYJ9oEfReYxICWcadn/qv/SxIP3b8BE9vQI0Ufov5uK1g9NMnhD249
9KzO3lvlikddJRbnSYinmi8LwJwjOR8oBIEMbAWuEY23RHXWjUC5/IzcGCd3B0jd
eGatrGPwzBR3TAua9C1MCYBz7ZT+FMOUHHU+gEdP58g9v3Ean8jWb6GYsDItMG8/
ScQ8M/4SuT3aE2LmTbwR
=uK0s
-END PGP SIGNATURE-



Re: [liberationtech] economic cost of lost emails.

2014-08-24 Thread J.M. Porup
On Sun, Aug 24, 2014, at 02:24, grjm wrote:
 On Sun, 17 Aug 2014 17:12:11 +0000 (UTC)
 Troy Benjegerdes ho...@hozed.org wrote:
  At my last 'full-time employee' gig, I was at a company that
  effectively lobotomized themselves with an idiotic data retention
  policy. One test engineer had 20 years of email going nearly back to
  when the company was started, and 'policy' was that it must be
  deleted.
 
 There is a lot of history loss going on, despite backups.  I've had
 personal content suddenly disappear from public services and lost many
 many communications due to filtering.  Plus, with sudden hardware failures,
 most of my personal backed-up data is just gone now.
 
 Are there any software projects out there to resist an eventuality
 of digital book burning?

No. But there should be. Although I fear any such efforts will
probably be futile.

https://www.anamericandissidentinexile.com/blog/2014/01/fahrenheit-72/

Jens

--
J.M. Porup
www.JMPorup.com



Re: [liberationtech] economic cost of lost emails.

2014-08-24 Thread Andrew Lewman
On Sun, Aug 24, 2014 at 05:24:49AM +0000, g...@i2pmail.org wrote 1.1K bytes in
0 lines about:
: There is a lot of history loss going on, despite backups.  I've had

Sorry you've learned the hard way about the difference between backups
and archiving. Most of us have learned this the same way.

: Are there any software projects out there to resist an eventuality
: of digital book burning?

Fine places to start are https://archive.org/about/faqs.php#Archive-It
and http://longserver.org/

Or maybe the NSA or GCHQ has it all. ;)

-- 
Andrew
pgp 0x6B4D6475



Re: [liberationtech] economic cost of lost emails.

2014-08-24 Thread taltman
I don't know exactly what is meant by eventuality of digital book
burning, but here's my opinion on the nuts and bolts of protecting your
data:

Prudent data backup/retention of digital data requires two key concepts:

1. Store data in a system that is self-healing.

In other words, if there is bit rot or other kinds of storage medium
malfunction, will the system detect it and repair the data?
Examples: rsbep, BTRFS and ZFS (Note: not the same as RAID, nor SMART)

http://arstechnica.com/information-technology/2014/01/bitrot-and-atomic-cows-inside-next-gen-filesystems/
http://users.softlab.ntua.gr/~ttsiod/rsbep.html
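
A minimal sketch of the detection half of concept 1, assuming a plain
directory of files and no help from the filesystem. It is not how rsbep,
BTRFS, or ZFS work internally; it only records a SHA-256 digest per file
and later reports files whose contents no longer match. The function
names (build_manifest, find_damage) are made up for this illustration.
Repair still needs redundancy, such as a second copy or parity data.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives do not need to fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_damage(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the manifest entries that are missing or whose digest changed."""
    damaged = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file() or sha256_of(p) != expected:
            damaged.append(rel)
    return damaged
```

Store the manifest alongside every copy of the archive, and rerun
find_damage each time the disks are spun up.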


2. Store copies of the data in multiple locations

Whether the threat is from earthquakes, fire, hurricane, civil unrest,
theft, or digital book burning, keep copies in multiple secure
locations. I'd recommend having one copy far away from where you live
and work; out of region. Encryption of these data would be a good idea
to give you peace of mind that you are not extending your attack surface
with all of these copies. Of course, then you need a separate backup
system for your encryption keys. :-)
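
A rough sketch of concept 2, assuming the archive is a single file that
has already been encrypted with a tool of your choice (encryption is not
shown here) and that each "location" is just a directory path, for
example a mounted remote disk. The names replicate and
repair_by_majority are invented for this example. Copies are verified
after writing, and a damaged copy is later restored from whichever
version the majority of copies agree on.

```python
import hashlib
import shutil
from collections import Counter
from pathlib import Path


def digest(path: Path) -> str:
    """SHA-256 of a whole file (fine for a sketch; chunk it for huge files)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def replicate(source: Path, destinations: list[Path]) -> list[Path]:
    """Copy source into every destination directory and verify each copy."""
    expected = digest(source)
    copies = []
    for dest_dir in destinations:
        dest_dir.mkdir(parents=True, exist_ok=True)
        copy = dest_dir / source.name
        shutil.copy2(source, copy)
        if digest(copy) != expected:
            raise IOError(f"copy at {copy} does not match the source")
        copies.append(copy)
    return copies


def repair_by_majority(copies: list[Path]) -> list[Path]:
    """Overwrite missing or dissenting copies with the majority version.

    Returns the paths that were rewritten. Refuses to act unless more than
    half of all expected copies agree, since otherwise there is no way to
    tell which version is the good one.
    """
    digests = {c: digest(c) for c in copies if c.is_file()}
    if not digests:
        raise FileNotFoundError("no copies are readable")
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes <= len(copies) // 2:
        raise ValueError("no majority among copies; refusing to repair")
    good = next(c for c, d in digests.items() if d == winner)
    repaired = []
    for c in copies:
        if digests.get(c) != winner:
            c.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(good, c)
            repaired.append(c)
    return repaired
```

With only two copies a disagreement cannot be resolved by voting, which
is one practical argument for keeping at least three.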

--

The ideal storage medium is a very controversial topic. It seems that
for small operators tape backups are not a good option in terms of cost
and upkeep. Optical discs are much more fragile than they were once
believed to be, and many won't last more than ten years (see link below). For
backups, spinning disks seem to be the best bet for now. For archiving,
store the archives in a self-healing system on disk, and keep the disks
offline (i.e., cold storage). You will probably want to spin up the
archive disks at least once every one to two years, to allow for the
self-healing system to do its job, and to detect catastrophic disk
failures (which will happen around year 5 to 7).

http://www.wbur.org/npr/340716269/how-long-do-cds-last-it-depends-but-definitely-not-forever


For items that you truly want to last for decades or even centuries,
print it out using high-quality ink on archival paper. There are
programs to print out documents with error-correcting codes on each
line, which kind of gives you concept #1 from above. Dried 2D pulp
technology has been proven effective based on millennia of testing, as
opposed to our current unreliable digital media.
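
As a simplified illustration of the per-line idea, assuming plain UTF-8
text: the sketch below appends a CRC-32 tag to each line before printing
and checks it when the page is typed or scanned back in. Note that this
is a swapped-in, weaker technique. A CRC only detects a damaged line; it
does not correct it the way a true error-correcting code (for example
Reed-Solomon) would. The function names and the " |xxxxxxxx" tag format
are invented for this example.

```python
import zlib


def protect_line(text: str) -> str:
    """Append an 8-hex-digit CRC-32 of the line, separated by ' |'."""
    checksum = zlib.crc32(text.encode("utf-8"))
    return f"{text} |{checksum:08x}"


def check_line(printed: str) -> bool:
    """Return True if the trailing tag matches the text before it."""
    text, sep, tag = printed.rpartition(" |")
    if not sep or len(tag) != 8:
        return False
    try:
        expected = int(tag, 16)
    except ValueError:
        return False
    return zlib.crc32(text.encode("utf-8")) == expected


def protect_document(document: str) -> str:
    """Tag every line of a document independently."""
    return "\n".join(protect_line(line) for line in document.splitlines())
```

Because each line carries its own tag, a smudge or OCR error is confined
to the line it hits, and only that line has to be recovered from another
copy.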

---

This is of course a gross simplification. I'd be curious to hear other
opinions as well.

Cheers,

~Tomer



On 8/24/14 10:22 AM, Andrew Lewman wrote:
 On Sun, Aug 24, 2014 at 05:24:49AM +0000, g...@i2pmail.org wrote 1.1K bytes
 in 0 lines about:
 : There is a lot of history loss going on, despite backups.  I've had

 Sorry you've learned the hard way about the difference between backups
 and archiving. Most of us have learned this the same way.

 : Are there any software projects out there to resist an eventuality
 : of digital book burning?

 Fine places to start are https://archive.org/about/faqs.php#Archive-It
 and http://longserver.org/

 Or maybe the NSA or GCHQ has it all. ;)



Re: [liberationtech] economic cost of lost emails.

2014-08-24 Thread J.M. Porup
On Sun, Aug 24, 2014, at 15:19, taltman wrote:
 I don't know exactly what is meant by eventuality of digital book
 burning, but here's my opinion on the nuts and bolts of protecting your
 data:

I believe we are approaching a Library of Alexandria moment. We have 
created an Information Age in which nothing is secure, and deleting 
unwanted information (thought crime) is trivial. Furthermore, infotech 
has redistributed power from the people to the government. It would be
naive to expect this power to go unabused. Totalitarianism is in
the wind.

If we really want a permanent archive of humanity's work, we 
need to build some kind of distributed Noah's Ark. Archive.org is
no good (book depositories are the first to go when the book-burning
starts), and asking the book-burners at the NSA and GCHQ to guard
our civilization's store of knowledge is laughable on its face.

Something P2P, maybe blockchain-based, might work. Convincing people
of the reality and urgency of the threat is another matter.

Jens

--
J.M. Porup
www.JMPorup.com



Re: [liberationtech] economic cost of lost emails.

2014-08-24 Thread taltman
Everything online is ephemeral. Just look at studies on link rot:

http://www.gwern.net/Archiving%20URLs

For storing the totality of humanity's work, we need to design something
more like the Svalbard Global Seed Vault:

https://en.wikipedia.org/wiki/Svalbard_Global_Seed_Vault

My $0.02,

~T


On 8/24/14 12:40 PM, J.M. Porup wrote:
 On Sun, Aug 24, 2014, at 15:19, taltman wrote:
 I don't know exactly what is meant by eventuality of digital book
 burning, but here's my opinion on the nuts and bolts of protecting your
 data:
 I believe we are approaching a Library of Alexandria moment. We have 
 created an Information Age in which nothing is secure, and deleting 
 unwanted information (thought crime) is trivial. Furthermore, infotech 
 has redistributed power from the people to the government. It would be
 naive to expect this power to go unabused. Totalitarianism is in
 the wind.

 If we really want a permanent archive of humanity's work, we 
 need to build some kind of distributed Noah's Ark. Archive.org is
 no good (book depositories are the first to go when the book-burning
 starts), and asking the book-burners at the NSA and GCHQ to guard
 our civilization's store of knowledge is laughable on its face.

 Something P2P, maybe blockchain-based, might work. Convincing people
 of the reality and urgency of the threat is another matter.

 Jens

 --
 J.M. Porup
 www.JMPorup.com





Re: [liberationtech] economic cost of lost emails.

2014-08-24 Thread Al Billings

On Aug 24, 2014, at 1:20 PM, taltman taltm...@stanford.edu wrote:

 Everything online is ephemeral. Just look at studies on link rot:
 
 http://www.gwern.net/Archiving%20URLs
 
 For storing the totality of humanity's work, we need to design something
 more like the Svalbard Global Seed Vault:
 
 https://en.wikipedia.org/wiki/Svalbard_Global_Seed_Vault

Someone explain to me why I’d *want* my emails stored until the end of time. 
I’d rather they rot and disappear if I made no effort to keep them.

That said, the NSA has a pretty good archive.

Al



Re: [liberationtech] economic cost of lost emails.

2014-08-24 Thread Natanael
A blockchain of torrent magnet links, of archives of all kinds of data like
everything public that Archive.org holds?
Then you both have it all accessible and you can verify that everybody sees
the same version.

I've been thinking of a sci-fi story concept of archivers collecting and
indexing absolutely everything that matters in a structured append-only
database of sorts (side story, but necessary in my sci-fi world).
Everything would be tagged and sorted and categorized and annotated. It
would be like a P2P Git with more metadata and the ability to search with
all sorts of filters, essentially an open Google/Wolfram Alpha given a
smart enough endpoint, with a bit of IBM Watson. There would be plenty of
separate projects all maintaining their own archives, of which some would
be thoroughly vetted for authenticity, and all updates ever would be signed
by the contributors/archivers.

Kind of like Wikipedia, actually, but with all sorts of filetypes and a full
semantic web, with the hash chain structure that Git and Bitcoin share
to prove the history of the data, and digital signatures.

It would already be possible to build today (it doesn't need any new exotic
algorithms or other inventions), but designing it can be incredibly hard
considering you'd have to figure out a standard way to handle
cross-referencing and annotation across all kinds of filetypes, and that
you need to define a data structure that won't need to be replaced every
few months because of frequently discovered limitations.
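
To make the hash chain part concrete, here is a minimal sketch of an
append-only log in which every entry commits to the one before it. It is
not Git's or Bitcoin's actual format, and it leaves out the digital
signatures, the P2P distribution, and any consensus mechanism. The entry
layout and function names are assumptions made for this example, and the
payloads stand in for magnet links.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: str) -> str:
    """Hash the payload together with the previous entry's hash."""
    record = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(record.encode("utf-8")).hexdigest()


def append(chain: list[dict], payload: str) -> None:
    """Add a payload to the end of the chain, linking it to the last entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append(
        {"prev": prev_hash, "payload": payload, "hash": entry_hash(prev_hash, payload)}
    )


def verify(chain: list[dict]) -> bool:
    """Check every link; any edited, removed, or reordered entry breaks it."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(entry["prev"], entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


if __name__ == "__main__":
    log: list[dict] = []
    # Hypothetical placeholders, not real magnet links.
    append(log, "magnet:?xt=urn:btih:<infohash-of-archive-1>")
    append(log, "magnet:?xt=urn:btih:<infohash-of-archive-2>")
    print(verify(log))
```

Two people who hold the same final hash can be confident they hold the
same history, which is the "everybody sees the same version" property.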

- Sent from my phone
On 24 Aug 2014 21:40, J.M. Porup j...@porup.com wrote:

 On Sun, Aug 24, 2014, at 15:19, taltman wrote:
  I don't know exactly what is meant by eventuality of digital book
  burning, but here's my opinion on the nuts and bolts of protecting your
  data:

 I believe we are approaching a Library of Alexandria moment. We have
 created an Information Age in which nothing is secure, and deleting
 unwanted information (thought crime) is trivial. Furthermore, infotech
 has redistributed power from the people to the government. It would be
 naive to expect this power to go unabused. Totalitarianism is in
 the wind.

 If we really want a permanent archive of humanity's work, we
 need to build some kind of distributed Noah's Ark. Archive.org is
 no good (book depositories are the first to go when the book-burning
 starts), and asking the book-burners at the NSA and GCHQ to guard
 our civilization's store of knowledge is laughable on its face.

 Something P2P, maybe blockchain-based, might work. Convincing people
 of the reality and urgency of the threat is another matter.

 Jens

 --
 J.M. Porup
 www.JMPorup.com



Re: [liberationtech] economic cost of lost emails.

2014-08-23 Thread grjm
On Sun, 17 Aug 2014 17:12:11 +0000 (UTC)
Troy Benjegerdes ho...@hozed.org wrote:
 At my last 'full-time employee' gig, I was at a company that
 effectively lobotomized themselves with an idiotic data retention
 policy. One test engineer had 20 years of email going nearly back to
 when the company was started, and 'policy' was that it must be
 deleted.

There is a lot of history loss going on, despite backups.  I've had
personal content suddenly disappear from public services and lost many
many communications due to filtering.  Plus, with sudden hardware failures,
most of my personal backed-up data is just gone now.

Are there any software projects out there to resist an eventuality
of digital book burning?

Personal knowledge and public knowledge have both fallen prey to
targeted attacks.  Wikipedia sure lacks decentralization and a web of
revision trust.