Chris Little wrote:
On Thu, 16 Jan 2003, Jimmie Houchin wrote:

You mean I can't reverse engineer them from the text markup itself?
You can try, but you will probably fail. It's not markup that poses a
problem. The markup is just other people's standards that we make use of. The problem is reading the binary indexes and following them from one
index into another index or another part of the same index (repeating as
necessary) and finally into the data, possibly needing to decrypt,
possibly needing to decompress along the way. Now repeat for all of the
different categories of modules we support and each different format of
module in that category.
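
To make the index chain described above concrete, here is a minimal sketch in Python. It is not the Sword file format: the record layouts (a 10-byte verse entry pointing into a 12-byte block entry, with zlib-compressed blocks) and the names `VERSE_ENTRY`, `BLOCK_ENTRY`, `build_sample_module` and `read_verse` are assumptions made up for illustration. The real layouts would have to be taken from the Sword source.

```python
import struct
import zlib

# ASSUMED layouts for illustration only (little-endian), not taken from Sword:
#   verse index entry: block number (uint32), offset inside block (uint32), length (uint16)
#   block index entry: offset in data (uint32), compressed size (uint32), uncompressed size (uint32)
VERSE_ENTRY = struct.Struct("<IIH")
BLOCK_ENTRY = struct.Struct("<III")


def build_sample_module(blocks):
    """Build an in-memory verse index, block index and compressed data area.

    `blocks` is a list of lists of verse strings; each inner list becomes
    one compressed block.
    """
    verse_idx, block_idx, data = bytearray(), bytearray(), bytearray()
    for block_no, verses in enumerate(blocks):
        raw = bytearray()
        for text in verses:
            encoded = text.encode("utf-8")
            verse_idx += VERSE_ENTRY.pack(block_no, len(raw), len(encoded))
            raw += encoded
        packed = zlib.compress(bytes(raw))
        block_idx += BLOCK_ENTRY.pack(len(data), len(packed), len(raw))
        data += packed
    return bytes(verse_idx), bytes(block_idx), bytes(data)


def read_verse(verse_idx, block_idx, data, n):
    """Follow verse index -> block index -> compressed data for entry n."""
    block_no, start, length = VERSE_ENTRY.unpack_from(verse_idx, n * VERSE_ENTRY.size)
    offset, csize, usize = BLOCK_ENTRY.unpack_from(block_idx, block_no * BLOCK_ENTRY.size)
    block = zlib.decompress(data[offset:offset + csize])
    if len(block) != usize:
        raise ValueError("block size does not match the block index")
    return block[start:start + length].decode("utf-8")


if __name__ == "__main__":
    v, b, d = build_sample_module([["In the beginning", "And the earth"], ["And God said"]])
    print(read_verse(v, b, d, 2))
```

Even in this toy version, nothing in the data area is readable without both indexes. An encrypted module would add a decryption step before the decompression.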
Thanks for sharing this information.
If you don't mind I would like to ask a few more questions to continue my education. :)

I don't understand the need to read the binary indexes. If I can read the text and produce it appropriately for the frontend to render to the user, then why do I need the indexes? Can't I create my own indexes based on the text?

Learning C++ and reading the code will be simpler and faster, not to
mention some aspects like decrypting (which I grant you may choose not to
support) and decompression (which you must support if you want to support
any module released or updated in the last year and a half) would be quite
impossible without using the code we use as a basis.  (I'd imagine Squeak
has some kind of zlib functionality included, but our files aren't simply
zips.)
I can probably read some C++, and possibly enough to understand what the functions/methods do. It has been a while since I've messed with C++. So as long as I don't have to remember about pointers and such. :)

Squeak can support multiple compression formats.

The Java frontend, does it use the Sword libraries or will there be Java code for reading Sword modules?
JSword is implementing routines to read Sword modules natively in Java. I don't know their progress, but it's a big task. There's also an
advantage in porting from C++ to Java in that the two are similar enough
that porting is made simpler and (if nothing else) someone who can write
Java can probably read C++ reasonably well. (I know that's a BIG
assumption of Java programmers--no offense to the JSword team, just Troy.)
Possibly so, but there are no pointers in Java.
At least that's what they say. :)
No memory management anyway.

It is quite possible I am underestimating the task of reading/parsing Sword modules. I thought the modules/text/rawtext/***/ot and modules/text/rawtext/***/nt files were simply text files which the Sword libraries parsed to create what was sent to the front end.
With the possible exception of RawLD, RawText is the simplest file format. It's not much challenge to write a driver to read this, but as I mentioned above, nothing is released in this format and any time a module is updated it is released in zText format. Some day when I get a free weekend, I'll convert everything out of RawText and into zText--or more likely into OSIS marked zText to kill two birds with one stone.

Even RawText is more complex than most people who open the files up assume. The verse ordering within them is arbitrary and there is no indication of where one verse begins and another ends. (This latter fact is not so important for Bibles since it's usually pretty easy to tell that verses break at a newline.) The .vss files associated with each ot/nt file are ordered according to the order of canon and they indicate the location and length of each verse record.
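
A rough sketch of that arrangement, in Python: the text file holds records in arbitrary order with no delimiters, and a separate index in canonical order gives each record's location and length. The 6-byte entry layout (little-endian 32-bit offset plus 16-bit length) and the names `ENTRY`, `build_sample_files` and `read_record` are assumptions for illustration. Mapping a verse reference to its index position is not shown, and the actual .vss layout should be checked against the Sword source.

```python
import struct

# ASSUMED index entry layout for illustration: uint32 offset + uint16 length, little-endian.
ENTRY = struct.Struct("<IH")


def build_sample_files(stored, canon_order):
    """Build an in-memory text file and its index.

    `stored` lists the records in the (arbitrary) order they sit in the text
    file; `canon_order` lists positions into `stored` in canonical order.
    """
    data = bytearray()
    spans = []
    for text in stored:
        encoded = text.encode("utf-8")
        spans.append((len(data), len(encoded)))
        data += encoded  # no delimiter between records
    index = b"".join(ENTRY.pack(*spans[i]) for i in canon_order)
    return index, bytes(data)


def read_record(index, data, n):
    """Return the n-th record in canonical order using the index."""
    offset, length = ENTRY.unpack_from(index, n * ENTRY.size)
    return data[offset:offset + length].decode("utf-8")


if __name__ == "__main__":
    index, data = build_sample_files(["third", "first", "second"], [1, 2, 0])
    print([read_record(index, data, n) for n in range(3)])
```

Reading the text file from top to bottom would return the records in the wrong order, which is why building a fresh index from the text alone does not work without knowing which record is which.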

Simply supporting RawText would provide a decent proof of concept, but if you're only supporting Sword modules because of the existing library of books then it isn't optimal to only support a quickly disappearing segment of that library of books.
OK. I haven't looked at any zText modules. I didn't understand why we had both RawText and zText. I looked at the RawText modules because that's where the KJV is.

I want whatever I do to be future capable. If reading Sword modules directly is more difficult than I currently realize, then I'll just interface with the Sword libs through a plugin or FFI. That's how Squeak gets some of its capabilities, so it isn't a foreign thing. If I go that route, I'll just have to hope that on any new platform for the Squeak Sword frontend someone can compile the plugin and libs.

If I understand correctly, at some point in the future there will be no rawtext directory? That could change the picture significantly. Then I could definitely see the need for viewing source (C++).

Thanks again for your help.

Jimmie Houchin
