Re: scanning a ton of documentation

2021-09-24 Thread Michael Mulhern via cctalk

Confirming the IA will derive from PDF as well as CBZ to generate OCRed PDFs.

//m

-- 
Blog: RetroRetrospective – Fun today with yesterday's gear
Podcast: Retro Computing Roundtable (Co-Host)


Re: scanning a ton of documentation

2021-09-24 Thread Al Kossow via cctalk

On 9/24/21 2:32 AM, Christian Corti via cctalk wrote:
> Don't get me wrong, but I had written some emails over the past few years
> offering stuff for bitsavers (downloadable from our FTP site). They
> never made it into the archives.
>
> Christian


I know I've put some things up. Sometimes offers get lost in my email stream.



Re: scanning a ton of documentation

2021-09-24 Thread Christian Corti via cctalk

On Wed, 22 Sep 2021, Al Kossow wrote:
> On 9/22/21 1:51 PM, Christian Corti via cctalk wrote:
>> Hasn't worked for me in the past ...
> guess I picked a bad day to stop sniffing glue

Don't get me wrong, but I had written some emails over the past few years
offering stuff for bitsavers (downloadable from our FTP site). They never
made it into the archives.


Christian


Re: scanning a ton of documentation

2021-09-24 Thread Christian Corti via cctalk

On Wed, 22 Sep 2021, Al Kossow wrote:
>>> Bitsavers will post process and create a searchable PDF
>> Since when?
> I think I'll just step away from the terminal for a few hours.
> I've been OCRing uploads for YEARS.

Ok, I didn't know that you do that for other people's scans; I thought you
only OCR your own scans.


Christian


Re: scanning a ton of documentation

2021-09-22 Thread geneb via cctalk

On Wed, 22 Sep 2021, Christian Corti via cctalk wrote:
> On Wed, 22 Sep 2021, Jay Jaeger wrote:
>> B/W, CCITT Group 4 tiffs at 400dpi is what I do, but then I also
>
> 600 DPI should be the absolute minimum today. There is absolutely no reason
> to go below that for B/W.

Agreed.

>> Bitsavers will post process and create a searchable PDF
>
> Since when?

No idea, but the IA will process the upload into html, plain text, mobi,
and OCRd PDF.


If I'm scanning bound books, those end up as individual tiff images (one
per page).  At the end of processing those, they get stuffed into a zip
file with the suffix "cbz" (Comic Book Zip) and once uploaded, the derive
task at the IA handles all the OCR work as well as creating those other
formats.  I'm pretty sure it will do the same with uploaded PDF files.
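
A minimal sketch of the packaging step described above, using only Python's standard library. It is not necessarily how anyone on the list does it: the function name build_cbz and the paths in the usage comment are hypothetical, and it assumes the page files have zero-padded names (page_001.tif, page_002.tif, ...) so that a plain sort gives page order. Storing without recompression is a choice made here on the assumption that the TIFFs are already compressed.

```python
import zipfile
from pathlib import Path


def build_cbz(page_dir, cbz_path):
    """Pack every .tif/.tiff file in page_dir into a CBZ (a plain zip archive).

    Returns the page file names in the order they were added.
    """
    # Collect the page list first; a plain sort is only page order if the
    # names are zero-padded (an assumption of this sketch).
    pages = sorted(
        p for p in Path(page_dir).iterdir()
        if p.suffix.lower() in (".tif", ".tiff")
    )
    # ZIP_STORED: add the files as-is, without compressing them again.
    with zipfile.ZipFile(cbz_path, "w", compression=zipfile.ZIP_STORED) as archive:
        for page in pages:
            # arcname drops the directory part so the archive is flat.
            archive.write(page, arcname=page.name)
    return [p.name for p in pages]


# Hypothetical usage:
# build_cbz("scans/manual_pages", "manual.cbz")
```

The result is an ordinary zip file; only the ".cbz" suffix marks it as a page-image archive.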


g.

--
Proud owner of F-15C 80-0007
http://www.f15sim.com - The only one of its kind.
http://www.diy-cockpits.org/coll - Go Collimated or Go Home.
Some people collect things for a hobby.  Geeks collect hobbies.

ScarletDME - The red hot Data Management Environment
A Multi-Value database for the masses, not the classes.
http://scarlet.deltasoft.com - Get it _today_!


Re: scanning a ton of documentation

2021-09-22 Thread geneb via cctalk

On Wed, 22 Sep 2021, devin davison via cctalk wrote:
> I have picked up storage containers for all the books, and I can scan it
> all. After that, it's all probably going in the recycle bin, as I don't know
> where or how I would keep such a large pile of paper manuals on hand.
>
> What is the preferred format to upload things to bitsavers in? Is PDF
> acceptable?

I would scan it all at 600dpi - that's what I do with everything I scan.
I'd upload it to the IA as well, but that's just me. :)

> How can I create a PDF that is not too big in file size? Can the text be
> recognized and made searchable within the scanned PDF?


Disk space is cheap, go for scan quality.
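
To put rough numbers on the size question, here is a small sketch of the arithmetic for a black-and-white (1 bit per pixel) scan. The page size of 8.5 x 11 inches is an assumption for illustration and is not stated anywhere in the thread; the helper name bilevel_page_bytes is likewise made up here. The figures are for uncompressed data, so a Group 4 TIFF will be smaller, by an amount that depends on the page content.

```python
def bilevel_page_bytes(width_in, height_in, dpi):
    """Uncompressed size in bytes of a 1-bit-per-pixel scan of one page."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    # Each row of pixels is padded up to a whole number of bytes.
    row_bytes = (width_px + 7) // 8
    return row_bytes * height_px


# Assumed page size: 8.5 x 11 inches (not from the thread).
for dpi in (400, 600):
    size = bilevel_page_bytes(8.5, 11, dpi)
    print(f"{dpi} dpi: {size:,} bytes uncompressed per page")
```

Pixel count grows with the square of the resolution, so going from 400 to 600 dpi multiplies the raw data by about 2.25: roughly 1.9 MB versus 4.2 MB per page before compression.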

g.



Re: scanning a ton of documentation

2021-09-22 Thread Al Kossow via cctalk

On 9/22/21 1:51 PM, Christian Corti via cctalk wrote:
>> Bitsavers will post process and create a searchable PDF
>
> Since when?

I think I'll just step away from the terminal for a few hours.
I've been OCRing uploads for YEARS.



Re: scanning a ton of documentation

2021-09-22 Thread Al Kossow via cctalk

On 9/22/21 1:51 PM, Christian Corti via cctalk wrote:
> Hasn't worked for me in the past ...

guess I picked a bad day to stop sniffing glue



Re: scanning a ton of documentation

2021-09-22 Thread Christian Corti via cctalk

On Wed, 22 Sep 2021, Jay Jaeger wrote:
> B/W, CCITT Group 4 tiffs at 400dpi is what I do, but then I also

600 DPI should be the absolute minimum today. There is absolutely no
reason to go below that for B/W.

> and notify Al Kossow of an available contribution.

Hasn't worked for me in the past ...

> Bitsavers will post process and create a searchable PDF

Since when?

Christian


Re: scanning a ton of documentation

2021-09-22 Thread devin davison via cctalk

Thank you all for the information. I would like to preserve the information
and get the books to good homes if possible. I will get a listing of what
is here. If the others who have Modcomp books and docs can list what they
have scanned and available, that can save a duplication of effort
scanning something that is already online.

I am located in Melbourne, FL.

thanks,

Devin D.



Re: scanning a ton of documentation

2021-09-22 Thread Lyle Bickley via cctalk

AFAIK, I have an entire library of MODCOMP documentation, consisting of over
one thousand manuals as PDFs. I will check with the person who had the
original manuals scanned to see if it is O.K. to zip them up and give
them to Al to post on bitsavers. (I expect them to say "O.K.")

Cheers,
Lyle



-- 
73   NM6Y
Bickley Consulting West
https://bickleywest.com

"Black holes are where God is dividing by zero"


Re: scanning a ton of documentation

2021-09-22 Thread Jay Jaeger via cctalk


> On Sep 22, 2021, at 14:05, devin davison via cctalk wrote:
> 
> Hello,
> 
> The person who referred me to my present job at a datacenter passed away
> this past Monday. He was a hardware/software engineer for Modcomp
> computers. He left me all of the computers and documents. There are too
> many books to keep: material concerning the Modcomp computers that is not
> saved anywhere else, as far as I can tell.
> 
> I have picked up storage containers for all the books, and I can scan it
> all. After that, it's all probably going in the recycle bin, as I don't know
> where or how I would keep such a large pile of paper manuals on hand.
> 
> What is the preferred format to upload things to bitsavers in? Is PDF
> acceptable?

If you go to bitsavers.org there is a short contributor’s guide.  For B/W, 
CCITT Group 4 tiffs at 400dpi is what I do, but then I also generate a PDF from 
them for my own use, and put both on my Google drive and notify Al Kossow of an 
available contribution.

> How can I create a PDF that is not too big in file size? Can the text be
> recognized and made searchable within the scanned PDF?

Bitsavers will post process and create a searchable PDF

> 
> any input would be appreciated, Thanks.
> 
> --Devin D.



Re: scanning a ton of documentation

2021-09-22 Thread Bill Degnan via cctalk

Where are you located?  We have a small amount of Modcomp docs here and
could take on a box more of the most useful paper docs for posterity's sake,
to round out what's already here.
Thanks
Bill Degnan
kennettclassic.com
Kennett Square, PA

On Wed, Sep 22, 2021 at 3:05 PM devin davison via cctalk <
cctalk@classiccmp.org> wrote:

> Hello,
>
> The person who referred me to my present job at a datacenter passed away
> this past Monday. He was a hardware/software engineer for Modcomp
> computers. He left me all of the computers and documents. There are too
> many books to keep: material concerning the Modcomp computers that is not
> saved anywhere else, as far as I can tell.
>
> I have picked up storage containers for all the books, and I can scan it
> all. After that, it's all probably going in the recycle bin, as I don't know
> where or how I would keep such a large pile of paper manuals on hand.
>
> What is the preferred format to upload things to bitsavers in? Is PDF
> acceptable?
>
> How can I create a PDF that is not too big in file size? Can the text be
> recognized and made searchable within the scanned PDF?
>
> Any input would be appreciated. Thanks.
>
> --Devin D.
>