3 cubic meters is about 60-90 book boxes of the size the mover gave us
for our books the last time we moved.
If you are going to do it yourself, over a long period of time, I hope,
I recommend the ScanSnap ix500 scanner. It scans about 25 pages (50
sides, since it scans both sides simultaneously) per minute, with a 50
sheet feeder and fairly intelligent detection of double feeds and blank
sides. You have to be careful to check for dust buildup. The software
that comes with it is also pretty good.
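For a rough sense of the time involved, here is a back-of-envelope sketch in Python. Only the 3 cubic meters and the 25 sheets per minute come from this thread. The sheet size (A4), the sheet thickness (0.1 mm), and the fraction of the volume that is actually paper (one half) are assumptions, and the function name is made up for illustration. It counts feeder time only, not sorting, unstapling, or clearing jams.

```python
def scanning_estimate(volume_m3,
                      fill_fraction=0.5,        # assumed: half the volume is paper
                      sheet_w_m=0.210,          # assumed: A4 width
                      sheet_h_m=0.297,          # assumed: A4 height
                      sheet_thickness_m=0.0001, # assumed: about 0.1 mm per sheet
                      sheets_per_minute=25):    # scanner speed quoted above
    """Return (estimated sheet count, hours of pure feeder time)."""
    sheet_volume_m3 = sheet_w_m * sheet_h_m * sheet_thickness_m
    sheets = volume_m3 * fill_fraction / sheet_volume_m3
    hours = sheets / sheets_per_minute / 60
    return sheets, hours


if __name__ == "__main__":
    sheets, hours = scanning_estimate(3.0)
    print(f"about {sheets:,.0f} sheets, about {hours:,.0f} hours of feeding")
```

Under those assumptions the pile comes to roughly 240,000 sheets and about 160 hours of continuous feeding. Change fill_fraction to match how loosely the boxes are packed; the result scales linearly with it.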
For processing, classifying, and storing the files, I recommend
DevonThink Pro Office if you have a Mac. It has some intelligence built
into it to detect similar content in different documents, and this
supports auto-classification and "see also" functionality. I confess I
haven't really given that part of it a test. The OCR of your documents
can be done by ScanSnap or by DevonThink. DevonThink does not do data
lock-in. Your documents will be files in the OS, but can be stored
optionally in the DT 'database' which is just a bundle of files with
indexes.
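To illustrate the general idea behind "see also" suggestions, here is a minimal Python sketch that ranks OCR'd documents by word-count cosine similarity. This is not DevonThink's actual method, which I have not examined. It is a generic textbook technique, and the function names and sample texts are invented for illustration.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Lowercase the text and count its alphanumeric words."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 to 1.0)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query_text, documents):
    """Return (score, name) pairs for every document, best match first."""
    query = word_counts(query_text)
    scored = [(cosine_similarity(query, word_counts(text)), name)
              for name, text in documents.items()]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    docs = {  # hypothetical OCR output keyed by file name
        "invoice.txt": "electric bill invoice payment due",
        "recipe.txt": "flour sugar butter oven",
    }
    for score, name in most_similar("payment due on electric bill", docs):
        print(f"{score:.2f}  {name}")
```

A real tool would likely weight rare words more heavily and cope with OCR errors; raw counts are used here only to keep the sketch short.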
There are commercial scanning services, but I've never checked out their
prices. If you scan them yourself, you will probably end up hating
staples as much as I do. They can go through the scanner easily and
harmlessly, but if they attach 2 or more sheets, you'll have to unjam
the document feed. Booklets are no problem if you can take the pages
apart. Books are no problem if you are happy bandsawing the spine off.
And it is very satisfying having everything on a hard drive, fully
backed up, fully indexed. Or so I believe -- I haven't gotten through my
stack yet.
--Barry
On 1 May 2016, at 22:34, Arlo Barnes wrote:
We have talked a little on this list about related topics, but I figured
I would ask people's opinions outright.
I have about 3 cubic meters of assorted paper documents -- and by
assorted I mean both unsorted into categories, but also of various
types.
For example, there are papers that are unimportant that should be set
aside for disposal. There are papers of mild interest that should be
kept if possible (in a digital form, as their physical presence has no
value beyond the contained information, and negative value in space
taken up and mental clutter added). There are documents that should be
digitized, but cannot be disposed of as their physical form is important
to their existence (certificates for instance). Some of the information
in the documents is sensitive, and since it is mixed in, the whole pile
should be treated as such (although there is nothing that could not be
shown to a well-trusted entity). And the papers are not all of the same
size or stock; some of them are loose, some pamphlets, brochures, or
even slim books.
Once they are digitized they will also need to be semanticized and
related to one another to start to make sense of it.
So, how should I go about this? Would mechanisation of some form help?
Can this even reasonably be done by one person?
-Arlo James Barnes
============================================================
FRIAM Applied Complexity Group listserv
Meets Fridays 9a-11:30 at cafe at St. John's College
to unsubscribe http://redfish.com/mailman/listinfo/friam_redfish.com
============================================================