You can use the Ghostview menu item Edit|Text Extract.
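Once the text has been saved to a plain file, a minimal sketch of pulling it into J might look like the following. The file name, the sample content, and the "data rows start with a digit" rule are all made up for illustration, and it assumes the standard `fread`/`fwrite` file utilities are available in the session.

```j
NB. Sketch only: file name and sample content are hypothetical.
NB. Assumes the standard file utilities (fread, fwrite, jpath) are loaded.

NB. Stand-in for the text file saved from Edit|Text Extract.
file=: jpath '~temp/extract.txt'
('id name',LF,'1 alpha',LF,'2 beta',LF) fwrite file

NB. Read the file as a boxed list of lines, one box per line.
lines=: 'b' fread file

NB. Keep only the lines whose first character is a digit
NB. (the data rows in this made-up sample).
isdata=: (e.&'0123456789')@{.@> lines
rows=: isdata # lines
```

From there the boxed rows can be cut into fields with whatever rule fits the real dump.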
----- Original Message ----
> From: Patrick van Beek <[EMAIL PROTECTED]>
> To: Programming forum <[email protected]>
> Sent: Wednesday, October 15, 2008 10:17:00 AM
> Subject: Re: [Jprogramming] JDB with no harddisk caching
>
> Hi Alex
>
> Is there a J package which allows you to import data from PDF files into J
> already? I have a 1250 page dump from an admin system which I would dearly
> like to extract data from. If I could write a script in J to pull out
> the data I am interested in that would be great.
> Patrick
>
> 2008/10/8 Alex Rufon
>
> > Hi,
> >
> > I guess another approach in designing the complete in-memory database is
> > on how I expect to use it.
> >
> > So here is a scenario where I would use an in-memory database.
> >
> > First off, I need to import data from multiple sources, MS-SQL, Oracle,
> > SAP, AS400 export text files, EDI files, Excel File, PDF, etc. All of
> > these data can be imported into a J session but the structures are
> > fundamentally different. I need to be able to read these files and store
> > them in a consistent structure. I then process these data and being
> > able to get "related" data as a SQL-like syntax would not hurt a bit. It
> > would help in readability and support/debugging later on. After
> > processing, I normally would make an SQL statement from the resulting
> > data or export the result onto another medium. All of the imported data
> > are then discarded.
> >
> > So from the paragraph above, you can deduce the following:
> > 1. I need a way to import data from different sources into a standard
> > structure.
> > 2. I need a consistent way of accessing data from different sources.
> > 3. Being able to relate the data to each other is helpful.
> > 4. I don't need to store the data and it will be discarded after
> > processing.
> > 5. Saving the data or its result does not need the JDB facility.
> >
> > The first 3 items can be done now with the existing JDB code. With the
> > above scenario, I find it unnecessary to create the DB folder and its
> > underlying files and folders since they will be re-created each time the
> > process is run.
> >
> > I also don't expect to import large amounts of data because of its
> > transient nature. If I would need to work with large data ... I will go
> > with the regular JDB system.
> >
> > I really haven't thought about Views but I would agree that they would
> > lend better as snapshots. In the context of in-memory database, they
> > would not be that useful though.
> >
> > Rather, as an enhancement to the standard JDB I would suggest the
> > following:
> > 1. Views - In my experience, views are slow operations. Particularly
> > when you create a view using another view. One of the policies that we
> > implemented in our software development is to stay away from views and
> > consider using stored procedures if you need views on views. Still, I
> > believe that there are legitimate uses for them and should be
> > supported.
> > 2. Triggers - for full blown applications, particularly with centralized
> > databases, triggers are indispensable tools in applying business rules.
> > Of course there are always two sides to a coin but I believe that JDB
> > should support triggers.
> >
> > r/alex
> >
> > On Tue, 2008-10-07 at 13:24 -0700, Oleg Kobchenko wrote:
> > > > From: Alex Rufon
> > > >
> > > > Hi Chris/Oleg,
> > > >
> > > > I have the following questions for JDB:
> > > > 1. Would it be possible to use JDB without creating the physical file
> > > > structure? I.e. Folders and files.
> > >
> > > This is an interesting question which was discussed.
> > > There are a few approaches, but we need to work out a good
> > > and clear design. There are a few aspects:
> > >
> > > A. Completely in-memory database, which is different from
> > > regular JDB database only in that the column-nouns are not
> > > mapped into physical files.
> > >
> > > B. Ability to have in-memory (temporary) tables along with
> > > regular mapped tables in regular JDB, like session temporary
> > > tables in RDBMS. But so that they are able to have foreign keys
> > > to regular mapped tables.
> > >
> > > C. Should temporary in-memory tables be represented exactly as
> > > regular mapped or as a simpler alternative, such as a boxed
> > > array with column names and column values?
> > >
> > > D. Concept of Views: they are treated as tables, but
> > > have featherweight implementation: contain a query, which
> > > has column aliases. It is realized into a Snapshot with
> > > list of autoids from the base table, but columns values
> > > are retrieved on demand from the underlying actual tables.
> > >
> > > E. The Views can be permanently defined in the database along
> > > with other tables; also transient Views can be used in
> > > a nested query.
> > >
> > > used to represent the transient inner
> > > result in a nested query.
> > >
> > > So we need to elaborate on these points.
> > >
> > > > 2. How do I find out the existing tables in a database without reading
> > > > the "dir" file?
> > >
> > > Documentation is updated.
> > >
> > > > 3. Isn't it logical to get a "Read" verb at the table level?
> > >
> > > It was there originally, but then decided to have uniform
> > > access through database interface. Insert should also be done
> > > through database. This improves level of granularity.
> > >
> > > > To explain
> > > > further, consider the following example:
> > > > NB. test table definition
> > > > testfields=: 0 : 0
> > > > field1 int
> > > > field2 varchar
> > > > field3 char
> > > > field4 boolean
> > > > )
> > > >
> > > > NB. Load the library and open my data directory
> > > > load 'data/jdb'
> > > > hf=: Open_jdb_ '/home/arufon/Temp/data'
> > > > NB. Create a new database named test
> > > > hd=: Create__hf 'test'
> > > > NB. Create a new table named test
> > > > ht=: Create__hd 'test';testfields
> > > > NB. Insert a sample data
> > > > Insert__ht 1;'first row';'a';0
> > > > NB. Retrieve the data from the table
> > > > Reads__hd 'from test'
> > > > +------+---------+------+------+
> > > > |field1|field2   |field3|field4|
> > > > +------+---------+------+------+
> > > > |1     |first row|a     |0     |
> > > > +------+---------+------+------+
> > > >
> > > > What I am saying is that to read the data from the table, I need to
> > > > access the database locale 'hd' while I already have a reference for the
> > > > actual table locale in 'ht'.
> > > >
> > > > I can already do Insert on the table level as shown in this code:
> > > > NB. Insert a sample data
> > > > Insert__ht 1;'first row';'a';0
> > > >
> > > > It would be logical to be able to do the other DML commands on both the
> > > > table and database level right?
> > > >
> > > > Let me know what you think. :)
> > > >
> > > > r/alex
> > > >
> > > > --
> > > > "The right questions are more important than the right answers to the
> > > > wrong questions."
> > > > -Dr. John Romagna
> > > > ----------------------------------------------------------------------
> > > > For information about J forums see http://www.jsoftware.com/forums.htm
> > --
> > "The right questions are more important than the right answers to the
> > wrong questions."
> > -Dr. John Romagna
