Re: [sword-devel] Added versification system

2024-09-23 Thread Arnaud Vié
Hi,

This sounds like a good opportunity to remind everyone that, in February, I
proposed a set of principles to redesign the way versifications work - with
the explicit goal of being able to dynamically build uncompromised
versification systems for every document, while retaining accurate
versification mappings for text study across different bibles.

cf. http://crosswire.org/pipermail/sword-devel/2024-February/049957.html

I was, and still am, hoping to get feedback on the core principles before
building a more detailed specification that we could then implement in
sword.

Cheers,

Arnaud
___
sword-devel mailing list: sword-devel@crosswire.org
http://crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page


Re: [sword-devel] Fwd: Making it easier to import OSIS documents in sword

2024-07-05 Thread Arnaud Vié
Thanks Peter for the answer !

> About half of the conf file entries are derived from the module and we have
> tools to automatise this, but conversely the other half is not. FWIW, the
> derived part is mostly that which is needed to function, while the not
> automatic one is needed to inform the reader.
>
The thing is, from what I can see pretty much all fields from the conf file
have an equivalent header specified in OSIS.
(Except for the technical fields specific to the sword module itself, like
the data path, driver, compression method, etc.)

So it would be nice, at least, if osis2mod was able to output a module
archive directly, including the conf file and the folder structure, so we
wouldn't have additional manual actions.
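To make the idea concrete, here is a minimal sketch of deriving a conf file from an OSIS header. It is not what osis2mod does today. The function name `conf_from_osis`, the sample document, and the mapping from OSIS header elements to conf keys (title to Description, rights to Copyright, refSystem to Versification) are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

OSIS = "{http://www.bibletechnologies.net/2003/OSIS/namespace}"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical, heavily trimmed OSIS document (header only, no text).
SAMPLE = """<osis xmlns="http://www.bibletechnologies.net/2003/OSIS/namespace">
  <osisText osisIDWork="Demo" osisRefWork="Bible" xml:lang="fr">
    <header>
      <work osisWork="Demo">
        <title>Demo Bible</title>
        <rights>Public domain</rights>
        <refSystem>Bible.KJV</refSystem>
      </work>
    </header>
  </osisText>
</osis>"""


def conf_from_osis(osis_xml, data_path, driver="zText"):
    """Build conf file text from the first <work> of an OSIS header.

    The element-to-key mapping below is an assumption, not an official rule.
    """
    text = ET.fromstring(osis_xml).find(OSIS + "osisText")
    work = text.find(OSIS + "header/" + OSIS + "work")

    def child(name):
        # Text of a direct child of <work>, or "" when absent or empty.
        el = work.find(OSIS + name)
        return el.text.strip() if el is not None and el.text else ""

    lines = [
        "[" + text.get("osisIDWork") + "]",
        # Technical fields: these cannot come from OSIS, the caller picks them.
        "DataPath=" + data_path,
        "ModDrv=" + driver,
        "SourceType=OSIS",
        "Encoding=UTF-8",
        # Descriptive fields: derived from the OSIS header.
        "Lang=" + text.get(XML_LANG, "en"),
        "Description=" + child("title"),
    ]
    ref_system = child("refSystem")
    if ref_system.startswith("Bible."):
        lines.append("Versification=" + ref_system[len("Bible."):])
    if child("rights"):
        lines.append("Copyright=" + child("rights"))
    return "\n".join(lines) + "\n"


print(conf_from_osis(SAMPLE, "./modules/texts/ztext/demo/"))
```

Only the data path and driver are passed in by the caller here, which matches the split described above: technical fields are chosen at build time, descriptive ones come from the document.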

> Modules can have a large number of versifications which means that your
> blob of text in the last verse goes away, mostly.
>

I don't understand what you mean by that. Do you mean that a single bible
module can have several distinct versifications for different books ?
Because that seems contrary to everything I've seen so far.

> There have been discussions re OSIS direct and the thought was this not
> useful


Do you have a link to these discussions ? I'm curious about the arguments.

I'd argue that it only appears "not useful" due to the force of habit
restricting your imagination ;-)
Quite similarly to our former discussion in jsword, where you considered it
"feature-complete" because it matched the use cases identified many years
ago - whereas in practice every current user has forked it to keep evolving.

The complexity of the current format practically enforces a single workflow
to work with modules : the current workflow where there are "expert"
repository owners, all with technical knowledge to generate and host
modules - and where each module is slowly built and updated with care.
On the contrary, accepting files in a clear, easy-to-write standard format
(here, OSIS) would open up the usage drastically in ways you do not
currently imagine - like my current objective of generating documents
dynamically from remote sources and having them immediately ready to import.


Best regards,

Arnaud


[sword-devel] Fwd: Making it easier to import OSIS documents in sword

2024-07-02 Thread Arnaud Vié
Hi all,

Has anyone given any thought to simplifying the import of OSIS documents in
sword ?

With my bible-scraper, I'm giving users a way to easily generate OSIS
documents.
The next step is to allow them to easily import the resulting document in
sword... But the current process is quite painful in this regard :

   - Usage of the osis2mod CLI, with relatively obscure options, and manual
   writing of a module conf file, is reserved to a "technical elite".
   Unless I'm missing something, *non-technical users have no easy way to
   import an OSIS document into sword.*
   - Even if I want to develop a simpler frontend hiding this complexity,
   ideally browser-based, osis2mod being distributed as a binary makes it
   *hard to integrate into a portable frontend* to automate the process.

I have a few ideas to improve this situation, on which I would like your
opinion - as well as historical context where appropriate.

*Strategy 1 : Rewrite or recompile osis2mod in a more portable fashion*

For example, it may be possible to represent most of the XML structure
changes done by osis2mod (described here, implemented here) as an XSLT
sheet or similar. This would make it easy to write portable osis2mod
implementations (in Java, JS...) without duplicating the maintenance for
all this transformation part.

A smaller impact variant would be to keep the osis2mod code mostly
unchanged, but compile it into a WASM module using emscripten, that could
be executed natively by web browsers. I have yet to try this, though.

*Strategy 2 : Allow libsword/jsword to consume OSIS documents directly*

OSIS is a well-documented, mostly well-specified and readable open format,
whereas "sword modules" are much more tied to one specific implementation
(osis2mod).
By accepting OSIS documents in input, instead of only sword modules, we
would be moving from a mostly closed environment to a truly open one.

I understand that the transformations/normalisations/indexes computed by
osis2mod have a purpose to improve the runtime efficiency of accessing the
bibles (not decompressing and loading in RAM a full bible all the time,
etc.), so I'm not suggesting we completely get rid of them.
However, they could be taken care of at "module installation" time by the
lib itself.

The lowest-impact change for libsword would be :

   - Embed osis2mod logic into libsword core
   - Update InstallMgr::installModule to no longer require a "mods.d", but
   also accept archives containing a single OSIS XML document.
   In that case, call the osis2mod logic to process the OSIS document and
   generate the actual modules.
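The dispatch step in the second bullet can be sketched as follows. This is a Python illustration of the decision only, not libsword code: `classify_archive` and `make_zip` are hypothetical helpers, and treating "exactly one XML file whose first bytes contain `<osis`" as an OSIS archive is an assumed heuristic.

```python
import io
import zipfile


def classify_archive(zf):
    """Decide how an installer could treat an archive.

    Returns "sword-module" when a mods.d/*.conf entry is present,
    "osis-document" when the archive holds exactly one XML file that
    looks like OSIS, and "unknown" otherwise.
    """
    names = [n for n in zf.namelist() if not n.endswith("/")]
    if any(n.startswith("mods.d/") and n.endswith(".conf") for n in names):
        return "sword-module"
    if len(names) == 1 and names[0].lower().endswith(".xml"):
        head = zf.read(names[0])[:4096]
        if b"<osis" in head:
            # This is where the embedded osis2mod logic would be invoked.
            return "osis-document"
    return "unknown"


def make_zip(files):
    """Build an in-memory zip from a {name: content} dict (demo helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    buf.seek(0)
    return zipfile.ZipFile(buf)


print(classify_archive(make_zip({"demo.xml": "<osis><osisText/></osis>"})))
```

Checking for mods.d first keeps the behaviour for existing module archives unchanged; the OSIS path is only taken when no conf file is found.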

With this, the installation of such an OSIS module would take a few more
seconds than for the usual modules, but in exchange would make the whole
ecosystem easier to interact with.

The problem here, of course, is that we'd have to duplicate that logic into
jsword - unless we're also making it more portable as per strategy 1.

*What are your thoughts on these two strategies ?*

I'm also interested in *any historical insight on this sword module format*,
which at first glance seems much more complex than it needs to be.
For example :

   - What is the purpose of offering multiple compression formats ? (half
   of which are not supported in the debian libsword builds by the way)
   - Why does osis2mod force bibles to fit into a versification (squashing
   all remaining texts into the last verse of a chapter) instead of building a
   specific index that accurately represents the contents of the original OSIS
   document ?
   - Why are contents always split by testament (ot/nt.bzs/v/z) ? Seems a
   bit arbitrary, especially since OSIS allows any kind of bookGroups.
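For readers unfamiliar with the second point, the effect can be illustrated with a toy model. This is not osis2mod's actual code; it only mimics the behaviour described above, where verses beyond the chapter length allowed by the target versification end up appended to the last allowed verse. The function name and data shapes are assumptions.

```python
def fit_to_versification(verses, max_verse):
    """Fit one chapter into a versification that allows `max_verse` verses.

    `verses` maps verse number -> text. Any verse numbered above
    `max_verse` is appended to the last allowed verse.
    """
    fitted = {}
    for number in sorted(verses):
        target = min(number, max_verse)
        if target in fitted:
            fitted[target] += " " + verses[number]
        else:
            fitted[target] = verses[number]
    return fitted


chapter = {1: "first", 2: "second", 3: "third", 4: "fourth"}
print(fit_to_versification(chapter, 2))
```

With a limit of two verses, the texts of verses 3 and 4 are merged into verse 2, so their original boundaries are no longer addressable.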


Thanks, and sorry for yet another very long email !

Cheers,

Arnaud


Re: [sword-devel] Introducing the Bible Scraper

2024-06-03 Thread Arnaud Vié
Sorry Cyrille, I'll keep the repository in my Github personal account for
the time being.

The main reason is that the scraper is still evolving in a legal grey area,
by allowing people to save and convert copyrighted contents - since I
intend to provide parser configuration yaml files for as many websites as I
can, to eventually make more and more bibles usable in AndBible and the
sword ecosystem.
I've done enough research to be confident I'm safe as per French law, but
as I integrate parsers for more bibles from websites in other countries,
there might be complaints. If that happens, it's much better if such
complaints target me alone as it's my personal project, and do not affect
CrossWire as a whole - especially since CrossWire does not really operate
under French jurisdiction and thus might not be as protected as I am.
As Donna said, it's perfectly fine if you want to keep a fork elsewhere,
but I'd suggest making it private, not publicly affiliated with CrossWire.

In addition to that, we had a discussion on the Github vs Gitlab topic a
few months ago (cf.
http://crosswire.org/pipermail/sword-devel/2024-February/049943.html ), and
I still believe that having some lively OSIS and Sword related projects on
Github will improve the visibility of the Sword ecosystem to attract new
developers in the long run, more so than Gitlab.

(On that topic, my proposal to take over and rejuvenate the GitHub
crosswire project, specifically the jsword repo, and adding a new repo for
the OSIS specification, still stands.)

Cheers,

Arnaud


On Sun, 2 June 2024 at 16:33, Fr Cyrille wrote:

> Hi Arnaud,
> What do you think about moving bible-scraper from the GitHub repo to our
> GitLab repo? I did this, but not with the latest commits. I made you a
> developer on it.
> https://gitlab.com/crosswire-bible-society/bible-scraper

Re: [sword-devel] Introducing the Bible Scraper

2024-06-02 Thread Arnaud Vié
Thank you both for your interest !

> What about commentary?
> https://www.awmi.net/reading/online-bible-commentary/

Not yet, I'm really focusing on bibles for the time being - that's a lot of
work already !
But nothing prevents adapting the solution to commentaries in the future,
I'll keep that idea in mind :-)

> If you want to use CzeBKR as your test case, I am ready to help
> you with any testing or Czech issues or whatever

Thanks a lot !
I've just pushed a scraper configuration for this bible :
https://github.com/UnasZole/bible-scraper/blob/master/src/main/resources/scrapers/GenericHtml/KralickaWikisource.yaml
Main books were easy to parse - deuterocanonical books extracted from a
different manuscript were a bit messier.
I made a few assumptions (I interpret italics in verse as translation
additions, and side notes in deuterocanonical books as section titles, etc.)
Feel free to test it : after checking out and building the repository, you
should just need to run for example:

> ./run.sh scrape -s GenericHtml -i KralickaWikisource -b Ps -c 1 -w USFM

Cheers,

Arnaud

On Sun, 2 June 2024 at 08:50, Matěj Cepl wrote:

> On Sun Jun 2, 2024 at 1:09 AM CEST, Arnaud Vié wrote:
> > I'm open to any kind of feedback or suggestions of course !
> > In particular :
> >
> > - if you have any specific website in mind that you would like to be
> >   able to build sword modules from, let me know, we can try to add it.
> >   (Currently I only included a few French websites, but I'm interested
> >   to add some other languages).
>
> Sword module CzeBKR is sourced from the Czech WikiSource [1]
> and there seems to be an official way [2] to get the source
> in some hopefully more useful formats (plain text, RTF, HTML,
> EPUB). I was using my own home-grown Python script [3], but it
> seems that, as with all web-scraping scripts, it has rotted away (that
> script is under some kind of very free open source license,
> let’s say MIT/X11 … I am going to add the proper LICENSE file
> momentarily). It started at [4] (look at the source view), but it
> doesn’t seem to be that useful anymore.
>
> > - And if you are knowledgeable about the intellectual property laws in
> >   other countries, I'm interested : currently, I've added a section to
> >   the README explaining why the usage of the scraper on any public
> >   website is allowed in France with references to the related texts,
> >   but it would probably be useful to have similar information for users
> >   from other countries.
>
> I am absolutely certain, there are no problems with CzeBKR:
>
> 1. It is WikiSource, so we have somebody else to blame ;)
> 2. The original Bible of Kralice [5] is from the sixteenth
>    century and it is absolutely in the public domain.
> 3. The source for the WikiSource text was a scan [6] of the book
>    from 1918, without any authors shown. The works of the only
>    possible editor of that Bible I know about [7] (he is
>    not shown on the title page, but he was working in the
>    early 20th century with the International Bible Society on
>    the revision of the Bible) are under the Berne Convention
>    (death in 1929 + 75 years) in the public domain as well.
> 4. We are in the EU as well.
>
> If you want to use CzeBKR as your test case, I am ready to help
> you with any testing or Czech issues or whatever.
>
> Blessed Sunday!
>
> Matěj
>
> [1] https://cs.wikisource.org/wiki/Bible_kralick%C3%A1_(1918)
> [2]
> https://ws-export.wmcloud.org/?lang=cs&title=Bible_kralick%C3%A1_%281918%29
> [3]
> https://gitlab.com/crosswire-bible-society/CzeBKR/-/blob/master/kralicka.py
> [4]
> https://cs.wikisource.org/wiki/Speci%C3%A1ln%C3%AD:Exportovat_str%C3%A1nky/Bible_kralick%C3%A1_(1918)
> [5] https://en.wikipedia.org/wiki/Bible_of_Kralice
> [6] http://archive.org/details/biblsvatanebvec00socigoog
> [7] https://cs.wikipedia.org/wiki/Jan_Karafi%C3%A1t
> --
> http://matej.ceplovi.cz/blog/, @mcepl@floss.social
> GPG Finger: 3C76 A027 CA45 AD70 98B5  BC1D 7920 5802 880B C9D8
>
> The ratio of literacy to illiteracy is a constant, but nowadays
> the illiterates can read.
> -- Alberto Moravia
>


[sword-devel] Introducing the Bible Scraper

2024-06-01 Thread Arnaud Vié
Hello all,

Cyrille already teased it in some of his previous mails on this list, but
I've been working for several months on a tool to scrape bibles from any
web page into a standard format (OSIS and USFM outputs are supported) : the
Bible Scraper.
It mostly serves two purposes :

   - *Help convert "loosely formatted" bibles, such as bibles
   transcribed from facsimiles on wikisource, to a standard semantic format.*
   These bibles usually have some light formatting that aims at replicating
   the visual appearance of the original document, but without a strong
   semantic markup. With proper configuration, the scraper can convert those
   to a fully formed OSIS or USFM document, as long as the formatting is
   consistent throughout the bible.
   This is the usage Cyrille has been experimenting with a lot recently, and
   with which we have been achieving promising results.

   - *Allow individual users to convert bibles, which are freely available
   on the web but which we don't have the rights to redistribute, into sword
   modules for their personal usage*.
   This relies on the right to personal copy, which is quite strongly
   upheld in French law (and probably most other European countries, as there
   are texts on the topic from the CJEU as well) : as long as a user has
   legitimate access to the contents he wishes to copy, he is allowed to
   download and process it for personal use. Since the scraper is just
   software that any user can run on his own machine, there is no intermediary
   that could be accused of illegitimate "redistribution" in any form.

In its current state, the tool is still mostly targeted at developers (I
don't yet publish a downloadable artifact, so interested users have to
clone the git repo, and run a maven build), but it's becoming mature enough
to be shared with those who want to have a look :
https://github.com/UnasZole/bible-scraper

I'm open to any kind of feedback or suggestions of course !
In particular :

   - if you have any specific website in mind that you would like to be
   able to build sword modules from, let me know, we can try to add it.
   (Currently I only included a few French websites, but I'm interested to add
   some other languages).
   - And if you are knowledgeable about the intellectual property laws in
   other countries, I'm interested : currently, I've added a section to the
   README explaining why the usage of the scraper on any public website is
   allowed in France with references to the related texts, but it would
   probably be useful to have similar information for users from other
   countries.

Thanks all and best regards,

Arnaud


Re: [sword-devel] Catholic versification / inter-versification mappings

2024-02-20 Thread Arnaud Vié
 by publishers.  Could we build a system
> which allowed out of order verses, or which allowed any scheme a Bible
> wished to follow?  Sure, but the added complexity for various tasks
> increases quite a bit for some of these allowances-- e.g., think index math
> for book chapter verse when we cannot assume numeric sequence; think
> abstract ordering of bookmarks not tied to any specific Bible, search
> results across Bibles, etc.
>
> Our vision with v11n definitions is that they will be as few as possible
> allowing us to map between them most easily; and as many as necessary to
> allow us to represent well enough a published work.
>
> Chris Little previously was our versification pumpkin holder and did some
> amazing work researching all this material.  As a demonstration of his
> thorough work and an example of the difficulties with v11n, see his work on
> just the LXX tradition:
>
> https://www.crosswire.org/svn/sword-tools/trunk/versification/lxx_v11ns/
>
> Chris has left our community after many years of volunteering massive time
> and effort.
>
> We haven't had anyone step up who is willing to commit the time and effort
> necessary and who holds our vision (as few as possible, as many as
> necessary).
>
>
> 2) MAPPINGS:
>
> SWORD and JSword support v11n to v11n mappings. Graciously, Костя Маслюк
> worked with us for over a year to discuss the problems and implement a
> versification mapping system which has been included in the engine.  He
> also added v11n mappings for systems he was interested in supporting. If
> anyone is interested in the discussions, they can see the archives, e.g.,
>
> https://www.crosswire.org/pipermail/sword-devel/2013-July/040154.html
>
> Historically, we called this topic alternate versification, so if you see
> "av11n" in the archives, you'll be aware.
>
> Registering v11n systems and mappings in the engine is straightforward in
> our versification manager, and as you can see, loading these dynamically
> from a file would be simple to implement:
>
> https://www.crosswire.org/svn/sword/trunk/src/mgr/versificationmgr.cpp
>
> During the development of the mapping infrastructure, our proof of concept
> was to see if we could concisely build a parallel Bible HTML display across
> versification systems, letting a user specify any number of Bibles, the
> first Bible v11n being the primary ordering driver:
>
> https://www.crosswire.org/svn/sword/trunk/examples/tasks/parallelBibles.cpp
>
> 
>
> So, now to the issue with adding another Catholic versification system.  I
> would love to continue to delegate ownership of v11n decisions!  I trusted
> Chris.  He said "no" all the time, and only allowed new versification
> definitions if we really couldn't support a set of Bibles using an existing
> system with our work arounds.  He spent the time necessary to understand
> the traditions, which published works would use the proposed versification,
> he had excellent skills clearly delineating systems-- generally he made
> well informed decisions from many hours of research.
>
> I don't understand the complex details nor have the time to do the
> research for each individual request.  My first thought is, where is
> Chris?!  Next, my uninformed mind thinks: we have v11n definitions
> "Catholic" and "Catholic2"! Why do we need an additional Catholic
> versification system?  Did we do a bad job with the first two?  Can we not
> follow our principles and create a superset between 2 or more of these?
> And of course, these are not proper responses.
>
> So, if anyone is prayerfully willing to take up this pumpkin-- to put in the
> time necessary to research Bible traditions and published works, to truly
> understand both the pros and cons of the decisions we've made to go down
> the path we are on, along with our workarounds for the cons, and is willing
> to live wholeheartedly with where we are now, but certainly always open to
> improve, I would love for that person to take ownership of versification.
>
> I appreciate the pointers to the Paratext v11ns and mappings, maybe we
> compare where we are now with what they have.
>
> Thank you all for being zealous to improve things.  Looking forward to the
> conversation to follow,
>
> Troy
>
>
>
> On 1/26/24 06:10, Arnaud Vié wrote:
>
> Hello everyone,
>
> I'm the person Cyrille mentioned, and I just joined the mailing list as I
> thought I could maybe explain a bit more what I'm trying to do with this
> new versification.
>
> *1. Problem statement*
>
> Simply put, my objective is to be able to align verse-by-verse the
> contents of two bibles that use different versifications.
> 

[sword-devel] Designing a modular versification system

2024-02-19 Thread Arnaud Vié
e, it can be any other versification.

*Practical application*

The practical application of these principles leads to the following setup :
- One specific "root" versification can be chosen by CrossWire and embedded
in sword, to be used as central point for mapping. That could be KJVA (as
it's already the current central point for mapping) by referencing a
specific edition of KJVA.
- A small set of "major" versifications are defined by CrossWire and
embedded in sword, along with an accurate mapping to this root. These major
versifications should be for versions that we consider "very influential",
ie many bibles mostly follow their verse splits (ex. Rahlfs LXX, possibly
one MT version, etc.)
- Finally, each bible can either reuse one of these major versifications
directly, or embed a custom versification built by
subtracting/aggregating/mapping to any of the major ones.

This allows each bible to accurately define its own versification without
ambiguity, while still inheriting as much of the mappings as possible from
the "major" versifications.

For example, one bible may use Rahlfs' LXX for all books except Esther, and
define a specific versification for Esther with explicit mapping to KJVA.

Other example : we no longer need to explicitly maintain NRSV and NRSVA :
it's very easy for these bibles to just reuse KJVA with one small mapping
for the only difference.
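The setup above can be modelled with a small data structure. The following is only a sketch of the principles, not a proposed format: the class and field names are hypothetical, and every verse count and mapping in the sample data is a placeholder chosen for illustration, not real versification data.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Ref = Tuple[str, int, int]  # (book, chapter, verse)


@dataclass
class Versification:
    """A versification that may inherit books and mappings from a base."""
    name: str
    base: Optional["Versification"] = None
    # book -> {chapter -> number of verses}, only for books defined here
    books: Dict[str, Dict[int, int]] = field(default_factory=dict)
    # explicit mappings to the root versification, for books defined here
    to_root: Dict[Ref, Ref] = field(default_factory=dict)

    def _owner(self, book):
        # Walk up the inheritance chain to the level that defines the book.
        v = self
        while v is not None:
            if book in v.books:
                return v
            v = v.base
        raise KeyError(book)

    def verse_count(self, book, chapter):
        return self._owner(book).books[book][chapter]

    def map_to_root(self, ref):
        # Unmapped references are assumed identical in the root.
        return self._owner(ref[0]).to_root.get(ref, ref)


# A "major" versification, embedded in the engine (placeholder values).
lxx = Versification(
    "LXX",
    books={"Ps": {9: 39}, "Esth": {1: 22}},
    to_root={("Ps", 9, 22): ("Ps", 10, 1)},
)

# A bible reusing LXX for everything except Esther (placeholder values).
my_bible = Versification(
    "MyBible",
    base=lxx,
    books={"Esth": {1: 20}},
    to_root={("Esth", 1, 1): ("Esth", 1, 3)},
)

print(my_bible.map_to_root(("Ps", 9, 22)))   # inherited from LXX
print(my_bible.map_to_root(("Esth", 1, 1)))  # defined by the bible itself
```

A book redefined by the bible takes its mappings only from the bible's own definition, while every other book inherits both verse counts and mappings from the major versification.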


And that's all for today, I think that description is long enough already !

Let me know your thoughts !
If we have a consensus on these principles, we can then start working on
defining an actual format.

Regards,

Arnaud Vié


Re: [sword-devel] Making better use of the CrossWire GitHub project ?

2024-02-18 Thread Arnaud Vié
history and analysis of why this is published in Balisage 2021
> conference:
>
>
> https://www.balisage.net/Proceedings/vol26/html/Robie01/BalisageVol26-Robie01.html
>
> Even in 2024, the tagging language USFM remains the "primary" tool to
> encode biblical works at almost all the organizations that produced OSIS.
> There is no momentum for that committee to ever meet again. But the spec
> has holes.
>
> https://gitlab.com/cmahte/osis-users-manual-2.1
>
> I started working on updating OSIS, and in the process received a reply
> from someone at ABS or UBS that although the OSIS spec is copyrighted and
> does not contain specific verbiage about reuse, I could and should consider
> it licensed under creative commons BY-SA. (At the time, I wasn't seeking to
> update OSIS, but freely copy from it in creating a successor or fork.)
>
> This means that OSIS is both abandoned and available for adoption by a
> successor body.  I've also since moved on from ever producing proposed
> changes to it or a fork myself. IF I ever got far enough along to need a
> formal spec, it would be extensions to USFM or to OpenDocument or more
> directly synonymous with that XML.  If you're interested, I'll dig up the
> contact information, and pass it along. But I do have a copy re-edited into
> USFM (or more specifically a draft version of PSFM... which means the way
> tables are built in my text are unusual.) If there is an effort to update,
> I can transform my work into LibreOffice Writer format.
>
> I suggest it is time to consider an OSIS 3, or at least an OSIS 2.2 spec
> that is owned by a successor organization instead of organizations that
> effectively abandoned it.  That's the missing link which would provide a
> mechanism to actually make changes to the standard.  People (including me)
> keep doing this search and landing at Crosswire Bible society as the best
> option for a new owner. But maybe who OWNS can be one of the topics
> considered by a committee.
>
> On Sat, Feb 17, 2024 at 9:47 AM Arnaud Vié 
> wrote:
>
>> Hi everyone,
>>
>> Having dived into the whole crosswire ecosystem recently, I'm at the same
>> time impressed at the quality of the tools provided (in particular the OSIS
>> standard and the JSword lib, as I've been working in Java), and worried by
>> what I perceive as a lack of dynamism around its development and
>> difficulty to contribute.
>>
>> By "lack of dynamism" I of course don't mean to criticise the time anyone
>> spends (as we contribute to a free ecosystem, we all have lives keeping us
>> busy elsewhere), but rather to highlight how rough it is for external
>> enthusiastic people to join.
>> For example, I'd like to contribute evolutions to the OSIS standard
>> around versification systems, but I have no idea where to make such
>> proposals, as there is only a mailing list dead since 2015
>> <http://crosswire.org/pipermail/osis-core/>, a few wiki pages
>> <https://wiki.crosswire.org/Category:OSIS> and a few downloadable
>> documents <https://crosswire.org/osis/> which are supposedly the latest
>> version.
>>
>> I think a lot of that could be improved by making better use of the
>> crosswire github project <https://github.com/crosswire>, which is
>> nowadays the first contact most young developers will have with these
>> crosswire projects.
>>
>> I'd like to propose a few changes, get your opinions, and volunteer to
>> execute them if everyone agrees.
>>
>>- *Revive the jsword github repository*.
>>That includes
>>   - Backporting the relevant changes from the andbible fork
>>   <https://github.com/AndBible/jsword/pulls?q=is%3Apr+is%3Aclosed>
>>   (excluding android-specific stuff - which I already mostly removed
>>   in my last PR there).
>>   - Setting up a release process to publish the jar on a maven
>>   repository.
>>   - Setting up a clear branching model and writing clear
>>   contribution guidelines.
>>   - Having a team of several people familiar with Java development
>>   to review PRs or answer questions in the issue tracker. I obviously
>>   volunteer, but more people is always better.
>>
>>   - *Create a new Git repository for the OSIS specification*.
>>Must contain :
>>   - In Git, the OSIS XSD schema, and the functional specification
>>   (basically, the contents of the current manual) in markdown or asciidoc
>>   format.
>>   So that contributions to the standard may be opened as pull
>>

Re: [sword-devel] Making better use of the CrossWire GitHub project ?

2024-02-18 Thread Arnaud Vié
Thanks Michael for all this information !
I was not aware of this Copenhagen Alliance's work - I just had a quick
look at the repository you linked, and it seems to cover a significant part
of the requirements I wanted to cover with my proposal, but not all.
I like the format, and the fact that it includes the ability to specify
mappings directly within the json definition.
However, from what I understand, it's not modular : each versification must
still be completely defined in a single file, and mappings are always
relative to the single "org" versification.

Anyway, I'll keep that in mind once I present my general idea here - maybe
I should also present it to this Copenhagen Alliance as a possible
evolution of their format.
Then, the change in OSIS would just be to allow refSystem to point to such
a json file, and for SWORD to allow packaging this file within the SWORD
module (and reading and understanding it, of course) - we wouldn't need to
define yet another standard.

On Sun, 18 Feb 2024 at 21:47, Michael H wrote:

> And, It appears "unspecified" is no longer true for the .vrs files.
>
> https://github.com/Copenhagen-Alliance/versification-specification
>
> I don't think it's pulled back into Paratext yet, but this is an actual
> "spec" to look at to understand the USFM ... not USFM, but
> Paratext/UBS/SIL/Bible Translation community approach to mapping
> versifications.
>
> https://github.com/Copenhagen-Alliance/versification-specification
>


Re: [sword-devel] Making better use of the CrossWire GitHub project ?

2024-02-18 Thread Arnaud Vié
I'm still formalising the proposal (writing an accurate description of all
the principles and objectives behind it), but I'll open a dedicated thread
in this mailing list very soon.

Cheers,

Arnaud

Le dim. 18 févr. 2024 à 12:35, Troy A. Griffitts  a
écrit :

> Dear Arnaud and others,
>
> Peter has done a good job summarizing.
>
> Yes, last year we had a discussion on the CrossWire and git topic and you
> can see the discussion in the mail archives here.
>
> https://crosswire.org/pipermail/sword-devel/2023-March/subject.html
>
> Some progress has been made.
>
> Regarding OSIS, can you post here what proposal you would like to make? I
> am sure many people here will have comments on your idea.
>
> I am happy for your interest to get involved and for your zeal to see
> things more visible and active.
>
> Welcome,
>
> Troy
>
>
> On February 18, 2024 00:57:34 MST, Peter von Kaehne 
> wrote:
>
>> Hi Arnaud,
>>
>> It makes sense to understand some things better when seen in history:
>>
>> There are three core projects to CrossWire - libsword,  jsword and the
>> text modules - all others are independent but related users.
>>
>> The SVN site for libsword is the current one, not an old one. It is just
>> that very little changes over long stretches. Libsword is 30+ years old
>> and does its job. Errors and bugs get corrected, and big proposals happen
>> once in a long while and then come into the code. Development happens in
>> spurts, once every few years currently - but as users (other projects) are
>> on disparate platforms, consensus is needed.
>>
>> Jsword is similarly old and largely feature complete, and little changes.
>> Two big projects use it and contribute back to it.
>>
>> Substantial internal changes would require consensus across these
>> projects at the very least.
>>
>> Most current development happens in programmes using it and in module
>> development.
>>
>> The GitLab site was created by some of us who create modules for texts
>> which are in the public domain but have little other exposure.
>>
>> OSIS - CrossWire is the principal user, but as it stands it is an
>> international standard, and not under our control. We do maintain some
>> internal amendments as the standard has not been updated otherwise since
>> creation.
>>
>>
>>
>>
>> Peter
>>

Re: [sword-devel] Making better use of the CrossWire GitHub project ?

2024-02-17 Thread Arnaud Vié
Thanks Matej for all the information !
(and your git mirror, that will be quite helpful :-) )

Is the gitlab project referenced anywhere on the crosswire website ?
Because I've been looking all over and only found the svn link ^^'
That's exactly the kind of problem I'm talking about when I say the
project's visibility could be improved, to make it easier for new
people to get interested and join !
I don't have anything against GitLab, but GitHub is by far more popular.
People can randomly search for projects on GitHub - but virtually no one
searches for projects on GitLab if they don't already know that the project
is hosted there. So if we use GitLab for all development, we should at
least put some links in the GitHub project description and on the crosswire
website to tell people where to go.


Regarding the GitLab project, just like pinoaffe I can't see any repository
related to the OSIS specification, only bible modules and a "script" repo.


And by the way, given the number of module (i.e. "data") repositories
present, another suggestion I can make is to keep the repositories related
to core functionality (spec, libraries, etc.) in a separate project, as
their contributors will likely be very different. As a developer, finding a
code repository in the middle of 6 pages of data repos is not very
convenient.
In that regard, it could even make sense to keep gitlab for data, and use
github for code - or just create a separate gitlab project for code
repositories, whatever people prefer.


Finally, for jsword, to be honest I'm not really worried about its
"organizational" status : after 5 years without breathing it's
unambiguously dead.
My request is mostly to try to reach whoever has admin rights on the
"crosswire" GitHub project, and see if they would be willing to let me take
over jsword to refresh it :-)


Le sam. 17 févr. 2024 à 21:12, Matěj Cepl  a écrit :

> On Sat Feb 17, 2024 at 4:46 PM CET, Arnaud Vié wrote:
> > I think a lot of that could be improved by making better use of the
> > crosswire github project <https://github.com/crosswire>, which is nowadays
> > the first contact most young developers will have with these crosswire
> > projects.
>
> https://github.com/crosswire is mostly dead. There
> is more life (especially for modules) at
> https://gitlab.com/crosswire-bible-society and then there is
> another GitLab at https://git.crosswire.org/ (for contributors
> only).
>
> >- *Revive the jsword github repository*.
>
> jsword is organizationally in many aspects a separate project from
> libsword.
>
> >   - *Create a new Git repository for the OSIS specification*.
>
> See on gitlab.
>
> >   - Ideally, I'd also suggest *moving the C++ sword code to github*.
> >     Having it only on an old SVN repo
> >     <https://crosswire.org/svn/sword/trunk/>, not browsable or searchable
> >     online, really harms its visibility. I used a little bit of SVN while in
> >     engineering school 12 years ago, but I doubt that most young devs nowadays
> >     even know about it.
>
> I don’t even comment on this one any more (just mirror it to
> https://git.cepl.eu/cgit/sword/), because where there is no
> advice, there is no help.
>
> Best,
>
> Matěj
>
> --
> http://matej.ceplovi.cz/blog/, @mcepl@floss.social
> GPG Finger: 3C76 A027 CA45 AD70 98B5  BC1D 7920 5802 880B C9D8
>
> Why should I travel, when I’m already there?
> -- Bostonian lady, when being asked why she never visited
> other places than Boston
>


[sword-devel] Making better use of the CrossWire GitHub project ?

2024-02-17 Thread Arnaud Vié
Hi everyone,

Having dived into the whole crosswire ecosystem recently, I'm at the same
time impressed by the quality of the tools provided (in particular the OSIS
standard and the JSword lib, as I've been working in Java), and worried by
what I perceive as a lack of dynamism around its development and the
difficulty of contributing.

By "lack of dynamism" I of course don't mean to criticise the time anyone
spends (as we contribute to a free ecosystem, we all have lives keeping us
busy elsewhere), but rather to highlight how rough it is for external
enthusiastic people to join.
For example, I'd like to contribute evolutions to the OSIS standard around
versification systems, but I have no idea where to make such proposals, as
there is only a mailing list dead since 2015
<http://crosswire.org/pipermail/osis-core/>, a few wiki pages
<https://wiki.crosswire.org/Category:OSIS> and a few downloadable documents
<https://crosswire.org/osis/> which are supposedly the latest version.

I think a lot of that could be improved by making better use of the
crosswire github project <https://github.com/crosswire>, which is nowadays
the first contact most young developers will have with these crosswire
projects.

I'd like to propose a few changes, get your opinions, and volunteer to
execute them if everyone agrees.

   - *Revive the jsword github repository*.
     That includes:
      - Backporting the relevant changes from the andbible fork
        <https://github.com/AndBible/jsword/pulls?q=is%3Apr+is%3Aclosed>
        (excluding android-specific stuff - which I already mostly removed in my
        last PR there).
      - Setting up a release process to publish the jar on a maven
        repository.
      - Setting up a clear branching model and writing clear contribution
        guidelines.
      - Having a team of several people familiar with Java development to
        review PRs or answer questions in the issue tracker. I obviously
        volunteer, but the more people the better.

   - *Create a new Git repository for the OSIS specification*.
     Must contain:
      - In Git, the OSIS XSD schema, and the functional specification
        (basically, the contents of the current manual) in markdown or asciidoc
        format, so that contributions to the standard may be opened as pull
        requests, reviewed, potentially stored as separate branches, etc.
      - A wiki tab where all relevant OSIS-related resources from the
        crosswire wiki should be copied.

   - Ideally, I'd also suggest *moving the C++ sword code to github*.
     Having it only on an old SVN repo
     <https://crosswire.org/svn/sword/trunk/>, not browsable or searchable
     online, really harms its visibility. I used a little bit of SVN while in
     engineering school 12 years ago, but I doubt that most young devs nowadays
     even know about it.

But for this last C++ part, I suspect it has a bigger impact on current
developers, since Troy is still actively developing it and using the Jira
bugtracker for this part - so there is no urgent need to change.
I'm really more worried about the jsword repo (it breaks my heart to see it
dead since 2019) and having a visible and versioned location for the OSIS
standard.

Please let me know your thoughts !
And whoever is currently admin of the github project, would you be willing
to grant me some permissions on the jsword repo and a new "osis-spec" repo
to start setting up all of this ?

Regards,

Arnaud Vié


Re: [sword-devel] Catholic versification

2024-01-26 Thread Arnaud Vié
Hello everyone,

I'm the person Cyrille mentioned, and I just joined the mailing list as I
thought I could maybe explain a bit more what I'm trying to do with this
new versification.

*1. Problem statement*

Simply put, my objective is to be able to align verse-by-verse the contents
of two bibles that use different versifications.
For example :
- I open Daniel 3 in a Catholic bible, it has 100 verses because the Prayer
of Azariah is included.
- I want to compare the translation verse by verse with, for example, the
KJVA. This means I want to see the Daniel 3 from KJVA, with all verses from
PrAzar included in the corresponding place.
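
To make the alignment problem concrete, here is a minimal sketch of a range-based lookup from one versification to another. The table below is illustrative only: the names (Span, CATHOLIC_TO_KJVA, map_verse) are hypothetical, the ranges are a simplification that I have not checked against any published mapping file, and the last verses of the chapter are deliberately left unmapped.

```python
from typing import NamedTuple, Optional


class Span(NamedTuple):
    """A contiguous run of source verses mapped onto a target run."""
    src_book: str
    src_chapter: int
    first: int      # first source verse of the run (inclusive)
    last: int       # last source verse of the run (inclusive)
    dst_book: str
    dst_chapter: int
    dst_first: int  # target verse corresponding to `first`


# ASSUMPTION: illustrative ranges only, not taken from a real mapping file.
# Verses after 97 are intentionally left out of this toy table.
CATHOLIC_TO_KJVA = [
    Span("Dan", 3, 1, 23, "Dan", 3, 1),
    Span("Dan", 3, 24, 90, "PrAzar", 1, 1),
    Span("Dan", 3, 91, 97, "Dan", 3, 24),
]


def map_verse(book: str, chapter: int, verse: int,
              table=CATHOLIC_TO_KJVA) -> Optional[str]:
    """Return the target OSIS ID for a source verse, or None if unmapped."""
    for s in table:
        in_chapter = (book, chapter) == (s.src_book, s.src_chapter)
        if in_chapter and s.first <= verse <= s.last:
            offset = verse - s.first
            return f"{s.dst_book}.{s.dst_chapter}.{s.dst_first + offset}"
    return None
```

A reader application would call map_verse for each verse of the chapter being displayed and pull the matching verse from the other bible, even when it lives in a different book.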

There was already logic to perform such mapping in jsword, and I recently
extended it to support deuterocanonical contents (on the AndBible fork,
since that's the only one where I got answers from the maintainer) :
https://github.com/AndBible/jsword/pull/13

Now, the problem is to be able to do the same with the deuterocanonical
additions to the book of Esther, because there are many different
"strategies" adopted by different bibles.
- Protestant bibles, when they have it, usually have it as a separate book
(AddEsth in KJVA).
- Some Catholic bibles have it as additional chapters at the end of Esther,
making Esther 16 chapters long : that's the existing "Catholic2"
versification, which maps to KJVA easily :
https://github.com/AndBible/jsword/blob/develop/src/main/resources/org/crosswire/jsword/versification/Catholic2.properties#L392
- Most Catholic bibles actually have the additions integrated directly
within the original text, using lettered verse numbers (13A, 13B, etc., see
here for example : https://www.aelf.org/bible/Est/3 )

Currently, the Catholic bibles with the text integrated use the "Catholic"
versification, and ignore all the lettered verses (or include the letters
as raw text) : basically, Esther 3.13 with these additions becomes one
single very long verse. This makes it impossible to map properly to the
AddEsth of KJVA.

*2. Proposed solution (Catholic3 versification)*

With the proposed Catholic3 versification (except it needs a few
adjustments compared to the file proposed by Cyrille), what I'd like to
achieve is to give a unique verse number to each of those.
For example, Esther 3 goes from 15 to 22 verses, with the OSIS IDs becoming:
- Esth.3.13 for verse 13
- Esth.3.14 for verse 13A...
- Esth.3.20 for verse 13G
- Esth.3.21 for verse 14
(Basically, the OSIS ID identifies the position of the verse; the actual
numbering from the bible can be preserved separately with the OSIS "n"
attribute or the "\vp" USFM keyword.)
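
The numbering scheme above can be sketched in a few lines: the OSIS ID is just the position of the verse in the chapter, and the printed label travels alongside it. The function name assign_osis_ids is hypothetical, and the label list is simply the 15 verses of Esther 3 with 13A to 13G inserted after verse 13, as in the example above.

```python
def assign_osis_ids(book, chapter, printed_labels):
    """Pair each printed verse label with a sequential, position-based OSIS ID.

    Returns a list of (osis_id, printed_label) tuples; the printed label is
    what would go into the OSIS "n" attribute (or "\\vp" in USFM).
    """
    return [
        (f"{book}.{chapter}.{position}", label)
        for position, label in enumerate(printed_labels, start=1)
    ]


# Esther 3 as printed: 1..13, then 13A..13G, then 14 and 15.
labels = (
    [str(n) for n in range(1, 14)]
    + [f"13{letter}" for letter in "ABCDEFG"]
    + ["14", "15"]
)
esther3 = assign_osis_ids("Esth", 3, labels)
```

With this input the chapter has 22 entries, 13G lands on Esth.3.20 and the printed verse 14 on Esth.3.21, which matches the list above.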

> would your use-case be served if canon_catholic.h was
> modified to increase the verse counts in Esther to 39, 23, 22, 47, 29,
> 14, 10, 41, 32, 14?

Since my objective is to allow mapping verse by verse, you'll understand
that I need the verse counts to be aligned with the actual usage. Having
the "versification" allow more verses than what's actually used defeats the
purpose.
In addition, I believe it's a very bad idea to make big changes to already
published versifications : the point of versifications is to give a unique
ID to a verse. Updating a versification will change all IDs for the verses
of already existing bibles that use this versification.
I really believe the best solution for the time being is to create a new
Catholic3 versification, as originally suggested.
I can provide the full definition very soon (though since I'm working with
JSword it will be in Java format first), and it should in theory be aligned
with Catholic and Catholic2 except for these differences in Esther. (I'll
check if there are more differences I missed).

*3. Modular versifications*

> I think at some point it would be nice to have per-book versifications
> or some other way to deal with bibles that don't follow a "standard"
> versification

Agreed.
If everyone is open to the idea, I'd like to work in the next few months on
an extension of the OSIS standard to define "modular" versifications, ie.
versifications that can be built by composing other versifications and
applying a diff.
Then each bible could, in its document header, not only reference a
standard versification with refSystem, but include its own specific changes
and how they map to the standard.
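
As a rough sketch of what "composing a versification and applying a diff" could look like, assuming a versification is represented as a plain mapping from book to per-chapter verse counts. Nothing here is an existing OSIS or SWORD format: the compose function, the BASE and DIFF names and the data layout are all hypothetical, and the counts are placeholders apart from Esther 3 going from 15 to 22 verses.

```python
from copy import deepcopy


def compose(base, diff):
    """Build a new versification from `base` plus per-chapter overrides.

    base: {book: [verse count of chapter 1, chapter 2, ...]}
    diff: {book: {chapter number: new verse count}}, or {book: None}
          to drop a book entirely.
    The base is never modified, so several documents can share it.
    """
    result = deepcopy(base)
    for book, chapters in diff.items():
        if chapters is None:
            result.pop(book, None)
            continue
        counts = result.setdefault(book, [])
        for chapter, count in chapters.items():
            # Grow the chapter list if the diff adds chapters past the end.
            while len(counts) < chapter:
                counts.append(0)
            counts[chapter - 1] = count
    return result


# ASSUMPTION: placeholder base holding only three chapters of one book.
BASE = {"Esth": [22, 23, 15]}
# A document-specific change: Esther 3 carries the lettered additions.
DIFF = {"Esth": {3: 22}}

composed = compose(BASE, DIFF)
```

The point of the design is that a document header would only need to carry the small DIFF part, while the shared base stays untouched and keeps its existing verse IDs.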

Before I spend time on the topic though, is there anyone in particular I
should ask to approve the general idea, and who would be interested in
reviewing proposals on the topic ?

Thanks all and best regards,

Arnaud

Le jeu. 25 janv. 2024 à 16:35, pinoaffe  a écrit :

>
> Hello,
>
> I don't know much about catholic bibles or sword, but just out of
> curiosity: would your use-case be served if canon_catholic.h was
> modified to increase the verse counts in Esther to 39, 23, 22, 47, 29,
> 14, 10, 41, 32, 14?  Or would the decreases in verse counts in other
> chapters of other books also be necessary?
>
> And would such a change be acceptable to others?
>
> The catholic bibles I've encountered "in the wild" a