Re: PDF storage software recommendations?

2010-06-21 Thread Michael Grünewald

Hi all,

Polytropon wrote:


> On Fri, 18 Jun 2010 19:24:52 -0700 (PDT), Bill Tillman btillma...@yahoo.com
> wrote:
>
>> But put me down for a vote on this method using simple text files
>> and awk.
>
> It JUST WORKS - that's the goal. It can be developed and configured
> very fast, can easily be extended (or limited), and data is stored
> in a STANDARD (!!!) format which allows you to do ANYTHING with
> it. You can even provide a web-driven interface for the database,
> even that is possible.


For my work I use a solution matching Polytropon's suggestion; it sounds
very natural to the UNIX user in me :)  I am a scientist and have to
deal daily with an increasing number of electronic papers, made
available in PDF, DjVu, PostScript or even TIFF.  I organised my library
so that each document gets its own directory. Each directory then
contains the document file(s) themselves and meta information, stored
alongside in files whose names are fixed (for instance +INDEX for general
information, +BIBTEX for BibTeX data, etc.).


I only needed a couple of hours to write a program easing the addition of
a document to the library and another one generating an HTML index out of
the meta information, and while my system is far from perfect, it
exists, and helps me every day.
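
A minimal sketch of what such an index generator could look like. The layout of +INDEX is not given in this message, so the "Key: value" line format, the "title" and "author" keys, and the function names read_index and build_html_index are all assumptions made for illustration.

```python
import html
from pathlib import Path


def read_index(doc_dir):
    """Parse a +INDEX file made of 'Key: value' lines (an assumed format)."""
    meta = {}
    index_file = Path(doc_dir) / "+INDEX"
    if index_file.is_file():
        for line in index_file.read_text(encoding="utf-8").splitlines():
            key, sep, value = line.partition(":")
            if sep:  # ignore lines without a colon
                meta[key.strip().lower()] = value.strip()
    return meta


def build_html_index(library):
    """Return an HTML list with one entry per document directory."""
    items = []
    for doc_dir in sorted(p for p in Path(library).iterdir() if p.is_dir()):
        meta = read_index(doc_dir)
        # Fall back to the directory name when no title is recorded.
        title = html.escape(meta.get("title", doc_dir.name))
        author = html.escape(meta.get("author", "unknown"))
        link = html.escape(doc_dir.name, quote=True)
        items.append(f'<li><a href="{link}/">{title}</a> ({author})</li>')
    return "<ul>\n" + "\n".join(items) + "\n</ul>\n"
```

Writing the result of build_html_index("/path/to/library") to an index.html file in the library root would give a browsable list; a BibTeX-aware version could read +BIBTEX in the same way.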


I also had to help colleagues in various ways with their computer, 
sometimes giving them some (seemingly) very unfriendly scripts I wrote. 
My experience is that, provided I show these people how it 
works and supervise their first steps with the program, they can 
actually use it and like it, despite the fact that the experience 
offered by the program is at first not as nice to them as the one 
offered by a GUI.


However, being a scientist, I would not consider my working environment 
as `standard', whatever it means!



>> We have a Windows based system at my current job which uses
>> FileMaker Pro. It's amazing what we can do with this and it's
>> like having a gigantic electronic filing cabinet.
>
> Oh, the paperless office... an utopia - at least in Germany,
> bureaucracy's home country. :-)


I thought France was that :)  Rules sometimes change so often that
administrative staff do not always have time to catch up with them all!
Nevertheless all of this bureaucracy is sometimes very useful too, but
it is always a bit annoying ;)


Cheers,
Michael
___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org


Re: PDF storage software recommendations?

2010-06-19 Thread Polytropon
On Fri, 18 Jun 2010 19:24:52 -0700 (PDT), Bill Tillman btillma...@yahoo.com 
wrote:
 
 Date: Thu, 17 Jun 2010 23:31:15 -0700
 From: Charlie Kester corky1...@comcast.net
 Subject: Re: PDF storage software recommendations?
 To: freebsd-questions@freebsd.org
 Message-ID: 20100618063115.ga57...@comcast.net
 Content-Type: text/plain; charset=us-ascii; format=flowed
 
 On Thu 17 Jun 2010 at 19:57:03 PDT Polytropon wrote:
 
 Maybe my answer will sound low level, but it works - REALLY works -
 and works with mostly every kind of data.
 
 It's good to see someone recommending a true Unix-style solution.  :)
 
 Here, here.  I too love simple text files. With the speed of today's
 computers it's not impractical to use text files.

That's basically what you have computers for - to make work faster,
not slower. :-)



 And something like you suggest with awk I think would work, except
 for one major thing. When building a database like this you usually
 have to build an interface that normal users will work with.

That's why I suggested building a shell + Tcl/Tk script around it
for the various database operations you can perform with it. The
idea is that it is customizable ad infinitum, because everything is
programmable into deepest details.



 And something that I could use versus something the other people
 in the office could use are often worlds apart. I once wrote a
 program to do linear optimization for cutting metal parts from
 stock lengths. For me it was a simple block of code about 30-40
 lines as I recall. The other guys in the warehouse saw it and
 told the boss they wanted it too. He then instructed me to expand
 it so the common users could work with it. Well 2 months later
 and about another 400 lines of code to make it user friendly we
 finally had something. So as I see it the interface for other
 not so tech-savvy users will be the trouble with this approach.

This sounds familiar. :-) I've also walked this way for average
users; at that time, my choice was to create a GUI control program
using C with Gtk. Today, I would consider that pure overhead.
My average users were psychiatrists, so any assumption about
intelligence would not match the reality. :-)

You wonder how people got their work done on 80x25 in a wrong
language 30 years ago...


 But put me
 down for a vote on this method using simple text files and awk.

It JUST WORKS - that's the goal. It can be developed and configured
very fast, can easily be extended (or limited), and data is stored
in a STANDARD (!!!) format which allows you to do ANYTHING with
it. You can even provide a web-driven interface for the database,
even that is possible.



 We have a Windows based system at my current job which uses
 FileMaker Pro. It's amazing what we can do with this and it's
 like having a gigantic electronic filing cabinet.

Oh, the paperless office... a utopia - at least in Germany,
bureaucracy's home country. :-)

Anyway, relying on a Windows program is, in my opinion, not the
best choice for a long-term project such as document filing. The
constant transitions in the underlying OS, the immense costs, the
lock-in driven by closed (non-standard) formats, and finally the
limitation to what the original program developers provided make me
wonder if this can really be useful for longer times (let's say,
20+ years - where the low level solution does still work).



 It's pricey and it took the IT guys some time to build it but
 it does do some fantastic things in keeping tons of files organized,
 indexed and searchable. But I'd like to try my hand at building
 something with text files and awk.

Just imagine about 20 years in the future, and you'll see what's
the better solution. :-)



-- 
Polytropon
Magdeburg, Germany
Happy FreeBSD user since 4.0
Andra moi ennepe, Mousa, ...


Re: PDF storage software recommendations?

2010-06-18 Thread Charlie Kester

On Thu 17 Jun 2010 at 19:57:03 PDT Polytropon wrote:


Maybe my answer will sound low level, but it works - REALLY works -
and works with mostly every kind of data.


It's good to see someone recommending a true Unix-style solution.  :)


Re: PDF storage software recommendations?

2010-06-18 Thread Andrea Venturoli

On 06/17/10 22:54, Dale Scott wrote:


I'm experimenting with OpenDocMan (PHP/MySQL, http://www.opendocman.com/)


We evaluated OpenDocMan (not me personally) and ended up choosing 
KnowledgeTree. YMMV.


 bye
av.


Re: PDF storage software recommendations?

2010-06-18 Thread Bill Tillman

Date: Thu, 17 Jun 2010 23:31:15 -0700
From: Charlie Kester corky1...@comcast.net
Subject: Re: PDF storage software recommendations?
To: freebsd-questions@freebsd.org
Message-ID: 20100618063115.ga57...@comcast.net
Content-Type: text/plain; charset=us-ascii; format=flowed

On Thu 17 Jun 2010 at 19:57:03 PDT Polytropon wrote:

Maybe my answer will sound low level, but it works - REALLY works -
and works with mostly every kind of data.

It's good to see someone recommending a true Unix-style solution.  :)

Hear, hear.  I too love simple text files. With the speed of today's computers
it's not impractical to use text files. And something like you suggest with awk
I think would work, except for one major thing. When building a database like
this you usually have to build an interface that normal users will work with.
And something that I could use versus something the other people in the office
could use are often worlds apart. I once wrote a program to do linear
optimization for cutting metal parts from stock lengths. For me it was a simple
block of code about 30-40 lines as I recall. The other guys in the warehouse
saw it and told the boss they wanted it too. He then instructed me to expand it
so the common users could work with it. Well, 2 months later and about another
400 lines of code to make it user friendly, we finally had something. So as I
see it the interface for other not so tech-savvy users will be the trouble
with this approach. But put me down for a vote on this method using simple
text files and awk.
 
We have a Windows based system at my current job which uses FileMaker Pro. It's 
amazing what we can do with this and it's like having a gigantic electronic 
filing cabinet. It's pricey and it took the IT guys some time to build it but 
it does do some fantastic things in keeping tons of files organized, indexed 
and searchable. But I'd like to try my hand at building something with text 
files and awk.





PDF storage software recommendations?

2010-06-17 Thread Michael W. Lucas
Hi,

I have to store a bunch of PDFs of orders.  I'd like to be able to
tag these by customer, date, and a couple of other characteristics,
and then search and/or sort by these tags.

I'm certain that we have something in ports that will do this, but
danged if I can find a good candidate.  While I'm sure I could build a
database/PHP app that would work, surely someone's already done this?
Any recommendations?

Thanks,
==ml

-- 
Michael W. Lucas    mwlu...@blackhelicopters.org
http://www.MichaelWLucas.com/, http://blather.MichaelWLucas.com/
New book:  Network Flow Analysis
pre-order now!  http://www.networkflowanalysis.com/


Re: PDF storage software recommendations?

2010-06-17 Thread Greg Larkin

Michael W. Lucas wrote:
 Hi,
 
 I have to store a bunch of PDFs of orders.  I'd like to be able to
 tag these by customer, date, and a couple of other characteristics,
 and then search and/or sort by these tags.
 
 I'm certain that we have something in ports that will do this, but
 danged if I can find a good candidate.  While I'm sure I could build a
 database/PHP app that would work, surely someone's already done this?
 Any recommendations?
 
 Thanks,
 ==ml
 

Hi Michael,

I maintain print/pdftk, and you can edit document metadata with it.  The
update_info subcommand should do what you want.  I found this page
describing that and some other functions of the tool:
http://scottnesbitt.net/ubuntublog/?p=269.

The only issue with pdftk right now is that it doesn't install on 6.x
due to problems with the underlying gcc Java toolchain.

Hope that helps,
Greg
-- 
Greg Larkin

http://www.FreeBSD.org/   - The Power To Serve
http://www.sourcehosting.net/ - Ready. Set. Code.
http://twitter.com/sourcehosting/ - Follow me, follow you



Re: PDF storage software recommendations?

2010-06-17 Thread Michael W. Lucas
On Thu, Jun 17, 2010 at 01:12:13PM -0400, Greg Larkin wrote:
 Michael W. Lucas wrote:
  Hi,
  
  I have to store a bunch of PDFs of orders.  I'd like to be able to
  tag these by customer, date, and a couple of other characteristics,
  and then search and/or sort by these tags.
  
  I'm certain that we have something in ports that will do this, but
  danged if I can find a good candidate.  While I'm sure I could build a
  database/PHP app that would work, surely someone's already done this?
  Any recommendations?
  
  Thanks,
  ==ml
  
 
 Hi Michael,
 
 I maintain print/pdftk, and you can edit document metadata with it.  The
 updateinfo subcommand should do what you want.  I found this page
 describing that and some other functions of the tool:
 http://scottnesbitt.net/ubuntublog/?p=269.

That looks like a fabulous tool, actually.  I've wanted that
functionality for years.  But it's not quite what I want.

We get orders for services via PDF.  We need to keep them, and call
them up months or years later.  We'd need to find things like all of
the PDFs for Customer X or all of the PDFs for circuit ID
such-and-such.  Surely other people have had this problem, for
generic documents/files if not PDFs in particular...

==ml

-- 
Michael W. Lucas    mwlu...@blackhelicopters.org
http://www.MichaelWLucas.com/, http://blather.MichaelWLucas.com/
New book:  Network Flow Analysis
pre-order now!  http://www.networkflowanalysis.com/


Re: PDF storage software recommendations?

2010-06-17 Thread Roland Smith
On Thu, Jun 17, 2010 at 01:12:13PM -0400, Greg Larkin wrote:
 Michael W. Lucas wrote:
  Hi,
  
  I have to store a bunch of PDFs of orders.  I'd like to be able to
  tag these by customer, date, and a couple of other characteristics,
  and then search and/or sort by these tags.
  
  I'm certain that we have something in ports that will do this, but
  danged if I can find a good candidate.  While I'm sure I could build a
  database/PHP app that would work, surely someone's already done this?
  Any recommendations?

Keep it simple. Rename the pdf files so that their names encode the data you
want. Then find(1) will do most of what you want. 

If the pdf files contain text instead of scanned images, you could probably do
the renaming automatically with the help of pdftotext(1) from the
'poppler-utils' port and your favorite scripting language.

Put them in sub-directories e.g. by year or even year/month if you've got lots.
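
A minimal sketch of this file-name approach, written in Python rather than as a find(1) one-liner. The naming scheme customer_date_circuit.pdf, the underscore separator (so the fields themselves must not contain underscores), and the helper names encode_name and find_orders are assumptions chosen for illustration.

```python
from pathlib import Path


def encode_name(customer, date, circuit_id):
    """Build a name such as 'acme_2010-06-17_ckt-42.pdf' (assumed scheme)."""
    return f"{customer}_{date}_{circuit_id}.pdf"


def find_orders(root, customer="*", date="*", circuit_id="*"):
    """Return matching PDFs anywhere below root, sorted by path.

    Each argument is a shell-style glob, so date="2010-*" selects a year.
    This is roughly what find(1) with -name does.
    """
    pattern = encode_name(customer, date, circuit_id)
    return sorted(Path(root).rglob(pattern))
```

For example, find_orders("/orders", customer="acme") corresponds to running find /orders -name 'acme_*_*.pdf', and it works the same whether the files sit in one directory or in year/month sub-directories.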

Roland
-- 
R.F.Smith   http://www.xs4all.nl/~rsmith/
[plain text _non-HTML_ PGP/GnuPG encrypted/signed email much appreciated]
pgp: 1A2B 477F 9970 BA3C 2914  B7CE 1277 EFB0 C321 A725 (KeyID: C321A725)




Re: PDF storage software recommendations?

2010-06-17 Thread Svein Skogen (Listmail Account)
On 17.06.2010 20:55, Michael W. Lucas wrote:
 We get orders for services via PDF.  We need to keep them, and call
 them up months or years later.  We'd need to find things like all of
 the PDFs for Customer X or all of the PDFs for circuit ID
 such-and-such.  Surely other people have had this problem, for
 generic documents/files if not PDFs in particular...

Sounds pretty much like a database and a filestore. Database to store
all the metadata, with pointers to some machine-readable filenames for
the filestore. I seem to remember that one of my previous employers
hired some code-for-hire guys from the UK to set that up (and alas brought
Oracle salespeople inside the premises. I swear, those guys are harder
to remove than cockroaches...), but I'm sure some guys more
SQL-friendly than me could code something for Postgres and give
it a nice frontend. ;)
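
A minimal sketch of the database-plus-filestore idea. It uses SQLite through Python's standard sqlite3 module instead of Postgres so that it runs without a server; the table name orders, its columns, and the helper functions are assumptions for illustration, not a schema anyone in the thread proposed.

```python
import sqlite3


def open_catalog(db_path=":memory:"):
    """Open (or create) the metadata database; the PDFs stay on disk."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS orders (
               id         INTEGER PRIMARY KEY,
               customer   TEXT NOT NULL,
               circuit_id TEXT,
               received   TEXT,
               path       TEXT NOT NULL UNIQUE
           )"""
    )
    return conn


def add_order(conn, customer, circuit_id, received, path):
    """Record one PDF; 'path' is the pointer into the filestore."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO orders (customer, circuit_id, received, path) "
            "VALUES (?, ?, ?, ?)",
            (customer, circuit_id, received, path),
        )


def paths_for(conn, column, value):
    """Return filestore paths where the given column equals value."""
    if column not in ("customer", "circuit_id"):
        raise ValueError(f"cannot search by {column!r}")
    rows = conn.execute(
        f"SELECT path FROM orders WHERE {column} = ? ORDER BY received",
        (value,),
    )
    return [row[0] for row in rows]
```

With this in place, paths_for(conn, "customer", "Customer X") answers the "all of the PDFs for Customer X" question, and the same call with "circuit_id" covers the circuit lookup.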

//Svein

-- 
+---+---
  /\   |Svein Skogen   | sv...@d80.iso100.no
  \ /   |Solberg Østli 9| PGP Key:  0xE5E76831
   X|2020 Skedsmokorset | sv...@jernhuset.no
  / \   |Norway | PGP Key:  0xCE96CE13
|   | sv...@stillbilde.net
 ascii  |   | PGP Key:  0x58CD33B6
 ribbon |System Admin   | svein-listm...@stillbilde.net
Campaign|stillbilde.net | PGP Key:  0x22D494A4
+---+---
|msn messenger: | Mobile Phone: +47 907 03 575
|sv...@jernhuset.no | RIPE handle:SS16503-RIPE
+---+---
 If you really are in a hurry, mail me at
   svein-mob...@stillbilde.net
 This mailbox goes directly to my cellphone and is checked
even when I'm not in front of my computer.

 Picture Gallery:
  https://gallery.stillbilde.net/v/svein/






Re: PDF storage software recommendations?

2010-06-17 Thread Elliot Finley
http://www.google.com/search?hl=en&q=open+source+document+management+system&aq=f&aqi=&aql=&oq=&gs_rfai=

On Thu, Jun 17, 2010 at 12:55 PM, Michael W. Lucas 
mwlu...@blackhelicopters.org wrote:

 On Thu, Jun 17, 2010 at 01:12:13PM -0400, Greg Larkin wrote:
  Michael W. Lucas wrote:
   Hi,
  
   I have to store a bunch of PDFs of orders.  I'd like to be able to
   tag these by customer, date, and a couple of other characteristics,
   and then search and/or sort by these tags.
  
   I'm certain that we have something in ports that will do this, but
   danged if I can find a good candidate.  While I'm sure I could build a
   database/PHP app that would work, surely someone's already done this?
   Any recommendations?
  
   Thanks,
   ==ml
  
 
  Hi Michael,
 
  I maintain print/pdftk, and you can edit document metadata with it.  The
  updateinfo subcommand should do what you want.  I found this page
  describing that and some other functions of the tool:
  http://scottnesbitt.net/ubuntublog/?p=269.

 That looks like a fabulous tool, actually.  I've wanted that
 functionality for years.  But it's not quite what I want.

 We get orders for services via PDF.  We need to keep them, and call
 them up months or years later.  We'd need to find things like all of
 the PDFs for Customer X or all of the PDFs for circuit ID
 such-and-such.  Surely other people have had this problem, for
 generic documents/files if not PDFs in particular...

 ==ml

 --
 Michael W. Lucas    mwlu...@blackhelicopters.org
 http://www.MichaelWLucas.com/, http://blather.MichaelWLucas.com/
 New book:  Network Flow Analysis
 pre-order now!  http://www.networkflowanalysis.com/



Re: PDF storage software recommendations?

2010-06-17 Thread Dale Scott
 I'm certain that we have something in ports that will do this, but
 danged if I can find a good candidate.  While I'm sure I could build a
 database/PHP app that would work, surely someone's already done this?
 Any recommendations?

I'm experimenting with OpenDocMan (PHP/MySQL, http://www.opendocman.com/) for 
storing ad hoc documents associated with part numbers in a WebERP system 
(http://www.weberp.org). OpenDocMan has been around for a while and 
didn't see a lot of activity after release, but seems to be pretty active 
again. We added a menu item in the WebERP ItemMaster page for a user to submit 
an associated document, which is just a link to the submit document page in 
OpenDocMan (also added a Search for Associated Documents menu item which is a 
link to a search in OpenDocMan for documents associated with that part number). 
If there are multiple documents associated with a part number, the user would 
have to zip the documents and then check-in the zip archive. This concept can 
be applied to other documents, such as a received purchase order which is then 
associated with a new internal sales order and production order.

I'm also investigating using Mercurial and the Windows TortoiseHg client (or a 
simplified custom management-and-incoming-inspection-clerk friendly client) to 
check-in an arbitrary directory structure. Users could create a local directory 
on their Windows box for mini-project work (e.g., datasheets for a 
commercial-off-the-shelf part, Word doc and graphics for a user manual, sales 
analysis spreadsheet and PowerPoint presentation, custom part drawings and work 
instruction, etc.), and when they're finished, check-in the directory. I 
think the folder check-in might be a simpler concept for casual users, but need 
to finish the strawman and get some critique.

Dale

==
Dale Scott, P.Eng.
e-mail: dalesc...@shaw.ca
http://dalescott.shawwebspace.ca




Re: PDF storage software recommendations?

2010-06-17 Thread Polytropon
On Thu, 17 Jun 2010 11:22:37 -0400, Michael W. Lucas 
mwlu...@blackhelicopters.org wrote:
 Hi,
 
 I have to store a bunch of PDFs of orders.  I'd like to be able to
 tag these by customer, date, and a couple of other characteristics,
 and then search and/or sort by these tags.
 
 I'm certain that we have something in ports that will do this, but
 danged if I can find a good candidate.  While I'm sure I could build a
 database/PHP app that would work, surely someone's already done this?
 Any recommendations?

Maybe my answer will sound low level, but it works - REALLY works -
and works with mostly every kind of data.

Basically, you need to keep two things in mind:
1. PDF file filenames
2. a CSV database with a known format.

Let's say you don't care much for the PDF file names. It's okay,
as you don't have to. You just have to make sure that there aren't
two files with the same name (but if there are, different path
prefixes / subdirs still make it possible).

Let's furthermore say you maintain a file of a format like this:

# $1            : $2            : $3         : $4
# filename      : Customer Name : Date       : Keywords
# --------------:---------------:------------:------------------
0477763.pdf     : Sixpack J. Q. : 2010-05-12 : paper, plastics
76248873aT.pdf  : Meow C.       : 2009-03-18 : fish, chips, beer
UF/5u7r3jh.pdf  : Woof D.       : 2010-01-05 : explosives
rrw85673.pdf    : Monk A.       : 2010-04-23 : tissues, water

Now you can easily search it (as it is pure text), and you can
use scripts (e. g. written in awk) to obtain specific information
and perform certain actions (like calling a PDF viewer program
with one or more files you want to view, or print files that
match certain criteria you can query for). You can use a script
to compact the database (remove the pretty printing that helps
when manually editing the file), or even sort it. The file name
can then point to a specific subtree with all the tricks you
can do on file system level.
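
A minimal sketch of querying such a file, written in Python as a stand-in for the awk scripts suggested here. It assumes the colon-separated layout shown above, with "#" comment lines and no colons inside a field; the names FIELDS, parse_catalog and search are made up for illustration.

```python
FIELDS = ("filename", "customer", "date", "keywords")


def parse_catalog(text):
    """Turn the colon-separated catalog text into a list of dicts."""
    records = []
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and the pretty-printed header
        parts = [part.strip() for part in line.split(":", len(FIELDS) - 1)]
        if len(parts) != len(FIELDS):
            continue  # skip malformed lines
        record = dict(zip(FIELDS, parts))
        record["keywords"] = [
            word.strip() for word in record["keywords"].split(",") if word.strip()
        ]
        records.append(record)
    return records


def search(records, customer=None, keyword=None, year=None):
    """Return the records matching every criterion that is given."""
    hits = []
    for rec in records:
        if customer and customer.lower() not in rec["customer"].lower():
            continue
        if keyword and keyword.lower() not in [k.lower() for k in rec["keywords"]]:
            continue
        if year and not rec["date"].startswith(str(year)):
            continue
        hits.append(rec)
    return hits
```

The file names returned by search(...) can then be handed to a PDF viewer or a print command. Because fields are stripped of surrounding whitespace, the same parser reads both the pretty-printed and the compacted form of the file.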

You can also easily (!) write your own GUI wrapper for a shell
script that does
- create new entries
- edit entries
- remove entries
- search for entries
- perform actions (open in viewer, print to printer)
- add new / remove unneeded data columns
I'd even recommend using Tcl/Tk for that.
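
A minimal sketch of two of the operations listed above, creating and removing entries, on which a GUI or shell wrapper could be built. It is written in Python rather than shell or Tcl/Tk, and the function names and the " : " output format are assumptions for illustration.

```python
from pathlib import Path


def format_entry(filename, customer, date, keywords):
    """Render one record in the compact 'a : b : c : d' form."""
    return " : ".join([filename, customer, date, ", ".join(keywords)])


def _key(line):
    """Return the first field (the file name) of a catalog line."""
    return line.split(":", 1)[0].strip()


def add_entry(catalog, filename, customer, date, keywords):
    """Append a record, refusing file names that are already listed."""
    path = Path(catalog)
    lines = path.read_text(encoding="utf-8").splitlines() if path.exists() else []
    if any(_key(line) == filename for line in lines):
        raise ValueError(f"{filename} is already catalogued")
    with path.open("a", encoding="utf-8") as handle:
        handle.write(format_entry(filename, customer, date, keywords) + "\n")


def remove_entry(catalog, filename):
    """Delete the record for filename; return how many lines were removed."""
    path = Path(catalog)
    lines = path.read_text(encoding="utf-8").splitlines()
    kept = [line for line in lines if _key(line) != filename]
    path.write_text("".join(line + "\n" for line in kept), encoding="utf-8")
    return len(lines) - len(kept)
```

Editing an entry can be expressed as remove_entry followed by add_entry; comment lines starting with "#" are left alone because their first field never equals a file name.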

Oh, and did I mention that you can not only use this for PDF
files, but for ALL files? It's very versatile and extendible.
It doesn't tie you to a specific program. Additionally, it can
be used on many platforms this way.

You don't even need PHP or databases for that. :-)



-- 
Polytropon
Magdeburg, Germany
Happy FreeBSD user since 4.0
Andra moi ennepe, Mousa, ...