Re: [Wikisource-l] Do we have tools for offline collaboration?

2018-04-13 Thread Bodhisattwa Mandal
On 13 April 2018 at 15:03, Nicolas VIGNERON 
wrote:


> There is BUB https://tools.wmflabs.org/bub/ but only for certain
> websites.
>

BUB has not been working for more than a year.


-- 
Bodhisattwa
___
Wikisource-l mailing list
Wikisource-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l


Re: [Wikisource-l] Do we have tools for offline collaboration?

2018-04-13 Thread Nicolas VIGNERON
2018-04-13 8:54 GMT+02:00 mathieu stumpf guntz <
psychosl...@culture-libre.org>:

> Good to know. I consulted the website of ABBYY and it says one option is an
> "Open license for local use on workstations", but I guess it's not a FLOSS
> license, unfortunately.
>
Not at all. Read more carefully: this license is available only once you
have already purchased more than 50 licenses (
https://www.abbyy.com/en-ca/finereader/licensing/ ), so at least 5000 € IIRC.

> By the way, what is the state of affairs regarding Indic languages?
>
I'll leave that one to people better acquainted with it, but it seems to work
fine.

> Do we have a central page documenting the existing OCR pipelines used by the
> Wikisource community?
>
Not that I know of.
And AFAIK, each Wikisource and each Wikisourcerer has a different system
(sometimes with small differences, sometimes with big ones).

> What should I say to a contributor who comes to me asking "I have this
> old PD book in my personal library that I would like to digitize, share
> and proofread on Wikisource, where should I start?". Do we have an online
> service, for example on Tool Labs, which lets you either upload a facsimile
> or simply input its URL, and which then launches the OCR, for example backed
> by Tesseract?
>
There is BUB https://tools.wmflabs.org/bub/ but only for certain websites.

> Shouldn't we update our roadmap[1], or is there a more up to date document
> elsewhere?
>
We should write a new document.

Regards, ~nicolas


Re: [Wikisource-l] Do we have tools for offline collaboration?

2018-04-13 Thread mathieu stumpf guntz
Good to know. I consulted the website of ABBYY and it says one option is
an "Open license for local use on workstations", but I guess it's not a
FLOSS license, unfortunately.


By the way, what is the state of affairs regarding Indic languages?

Do we have a central page documenting the existing OCR pipelines used by the
Wikisource community?


What should I say to a contributor who comes to me asking "I have this
old PD book in my personal library that I would like to digitize,
share and proofread on Wikisource, where should I start?". Do we have an
online service, for example on Tool Labs, which lets you either upload a
facsimile or simply input its URL, and which then launches the OCR, for
example backed by Tesseract?


Shouldn't we update our roadmap[1], or is there a more up to date 
document elsewhere?


[1] https://meta.wikimedia.org/wiki/Wikisource_roadmap


On 13/04/2018 at 08:28, Nahum Wengrov wrote:
> I use ABBYY FineReader; I don't remember the exact version (probably 12
> or 11). I bought it a few years ago and it works perfectly for my
> language (Hebrew).



Re: [Wikisource-l] Do we have tools for offline collaboration?

2018-04-12 Thread mathieu stumpf guntz

Thank you Nahum,

Could you indicate which OCR solution you are using?




Re: [Wikisource-l] Do we have tools for offline collaboration?

2018-03-26 Thread Nahum Wengrov
I frequently work offline on he.wikisource. I download the entire PDF file
from Commons to my hard drive, and OCR the page I need myself. One can use
the OCR of Wikisource and download the text too, I guess, page by page.
Then I proofread the text in a Word document, open in the lower half of my
screen, with the PDF open in the upper half of the screen, where I go to
the page I need with Acrobat Reader, and scroll both windows down or up as
needed.
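
For anyone wanting to script the "OCR the page I need myself" step with free
tools rather than a commercial OCR package, a minimal sketch follows. It
assumes Poppler's pdftoppm and Tesseract are installed locally; the function
names, the file name "book.pdf" and the default language code "heb" are
illustrative assumptions, not something used in the workflow above.

```python
import subprocess
from pathlib import Path


def ocr_page_commands(pdf_path, page, lang="heb", dpi=300, workdir="."):
    """Return the two command lines needed to OCR one page of a PDF."""
    # Hypothetical naming scheme for the intermediate files: <pdf stem>-p<page>
    stem = Path(workdir) / f"{Path(pdf_path).stem}-p{page}"
    # pdftoppm: -f/-l select the first/last page, -r sets the resolution,
    # -png picks the output format, -singlefile avoids a page-number suffix.
    render = ["pdftoppm", "-f", str(page), "-l", str(page), "-r", str(dpi),
              "-png", "-singlefile", str(pdf_path), str(stem)]
    # tesseract <image> <output base> -l <language> writes <output base>.txt
    ocr = ["tesseract", f"{stem}.png", str(stem), "-l", lang]
    return render, ocr


def ocr_page(pdf_path, page, **kwargs):
    """Render one page to PNG, OCR it, and return the recognised text."""
    render, ocr = ocr_page_commands(pdf_path, page, **kwargs)
    subprocess.run(render, check=True)
    subprocess.run(ocr, check=True)
    return Path(ocr[2] + ".txt").read_text(encoding="utf-8")
```

Calling ocr_page("book.pdf", 12) would then leave book-p12.png and
book-p12.txt next to each other, ready to be opened side by side.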



Re: [Wikisource-l] Do we have tools for offline collaboration?

2018-03-26 Thread mathieu stumpf guntz

On 24/03/2018 at 16:22, billinghurst wrote:
> Though that would defeat the purpose of online proofreading with
> account verification. Some of the true value of our online process is
> that contribution builds a level of trust and knowledge and that is
> reflected in both our patrolling and the allocation of autopatrolled
> status.

How would providing tools for batch work offline interfere in any way
with that? Once the work is done, it can be uploaded to Wikisource with
whichever account the user wants.

Actually, to my mind, the main benefit of the online aspect is the
peer-to-peer production model. Also, there is no need for a central node
carrying accounts in order to take into account the trust given to a
particular contributor. There are digital signature technologies, such as
gpg for example. Having a central node with a web interface just makes
things easier for most users; it doesn't improve the trustworthiness of
the environment. On the contrary, with a single point of failure, we
actually rely on a weaker solution in this regard.

> Also how would you have access to templates, and components like that
> from off-line?

Well, that just shows how inefficient these tools are for continuing to
contribute while offline. It's always possible to install MediaWiki and
download the required templates, but currently this process seems way too
complicated, doesn't it?

> Also we generally cannot download the images separately as that is
> usually part of the later clean-up where people have the technical skills.

I'm afraid the term "image" misguided your answer. It seems you
interpreted it as picture elements from files, while I was talking
about the files themselves.

> So yes, there is the capacity to have the text and proofread the text,
> that actual checking the text against the image is not the sole
> component of proofreading, and further it would not be at all helpful
> for validation.

There is nothing magic about working directly in a browser. People do
download and upload all the required material anyway, but on a
page-per-page basis. The result is just as valid as when transactions
are operated at the file repository level.
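
To illustrate the page-per-page download, here is a rough sketch that fetches
the wikitext of one proofread page through the standard MediaWiki Action API
(action=query, prop=revisions). The endpoint, the "Page:" title pattern, the
User-Agent string and the example file name are assumptions for illustration;
each Wikisource may use a localized namespace name.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: any Wikisource api.php endpoint could be used here.
API = "https://fr.wikisource.org/w/api.php"


def page_query_url(index_file, page, api=API):
    """Build the API URL returning the current wikitext of one scanned page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "rvslots": "main",
        # Assumed title pattern of the Page namespace: Page:<file>/<number>
        "titles": f"Page:{index_file}/{page}",
        "format": "json",
        "formatversion": "2",
    }
    return f"{api}?{urlencode(params)}"


def extract_wikitext(response):
    """Pull the wikitext out of a decoded formatversion=2 response."""
    page = response["query"]["pages"][0]
    if page.get("missing"):
        return None  # the page has not been created yet
    return page["revisions"][0]["slots"]["main"]["content"]


def fetch_wikitext(index_file, page, api=API):
    """Download one page; the User-Agent value is a placeholder."""
    request = Request(page_query_url(index_file, page, api),
                      headers={"User-Agent": "offline-proofread-sketch/0.1"})
    with urlopen(request) as reply:
        return extract_wikitext(json.load(reply))
```

Looping fetch_wikitext over the page numbers of a work and saving each result
to a text file would give the offline copy; uploading the corrected text back
would need an authenticated edit request, which is not covered here.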


Cheers


Re: [Wikisource-l] Do we have tools for offline collaboration?

2018-03-25 Thread Yann Forget
FYI, Zoé on the French Wikisource works offline, and then copy-pastes the
proofread text back to Wikisource.
Seeing the result, she has quite a good process, fast and good quality.
You might want to ask her how she works:
https://fr.wikisource.org/wiki/Sp%C3%A9cial:Contributions/Zo%C3%A9

Regards,

Yann


2018-03-24 20:28 GMT+05:30 mathieu stumpf guntz <
psychosl...@culture-libre.org>:

> Hello,
>
> A person in a local Wikisource workshop asked me if we could download all
> material of a specific work to proofread it offline. So download both the
> pictures and the OCRed text. Additionally, I think it would be good to
> provide a tool to at least have plain text and pictures side by side.
>
> So, are you aware of anything close to such a tool? :)
>
> Cheers


-- 
Jai Jagat 2020 Grand March Coordinator
https://www.jaijagat2020.org/
+91-62 60 140 319
+91-74 34 93 33 58


Re: [Wikisource-l] Do we have tools for offline collaboration?

2018-03-24 Thread billinghurst
Though that would defeat the purpose of online proofreading with account 
verification. Some of the true value of our online process is that 
contribution builds a level of trust and knowledge and that is reflected 
in both our patrolling and the allocation of autopatrolled status.  Also 
how would you have access to templates, and components like that from 
off-line?


Also we generally cannot download the images separately as that is 
usually part of the later clean-up where people have the technical 
skills.


So yes, there is the capacity to have the text and proofread the text, 
that actual checking the text against the image is not the sole 
component of proofreading, and further it would not be at all helpful 
for validation.


-- billinghurst

-- Original Message --
From: "mathieu stumpf guntz" <psychosl...@culture-libre.org>
To: "discussion list for Wikisource, the free library" 
<wikisource-l@lists.wikimedia.org>

Sent: 25/03/2018 1:58:32 AM
Subject: [Wikisource-l] Do we have tools for offline collaboration?


Hello,

A person in a local Wikisource workshop asked me if we could download 
all material of a specific work to proofread it offline. So download 
both the pictures and the OCRed text. Additionally, I think it would be
good to provide a tool to at least have plain text and pictures side by
side.


So, are you aware of anything close to such a tool? :)

Cheers