Re: [translate-pootle] Pootle 2.1.0 Alpha1 snapshot release

Leandro Regueiro Mon, 12 Apr 2010 09:17:05 -0700

>>> Thanks Leandro for the links and detailed reply.
>>>
>>> I must confess I don't know much about TBX. The toolkit has a storage
>>> class that supports it, and Pootle can open TBX files. At the moment you
>>> can use TBX files for the server-wide terminology: just delete the default
>>> terminology project and recreate it, choosing TBX format instead of PO.
>>>
>>> But I doubt we support all the fancy features. Friedel and Dwayne
>>> probably know more about this.
>>
>> Ok. I am curious about which subset of TBX is supported by the storage
>> class in TTK. How can I see it without looking at the code?
>
> Hi Leandro
>
> This page is where we collect our notes about our TBX support:
>
> http://translate.sourceforge.net/wiki/toolkit/tbx
>
> As you can see, we don't support everything, although we should be able
> to read most files (except those using ntig), and will be happy to
> support more features in future. Feel free to share your ideas on what
> is important.


I think that first you should try to support at least the same basic
things that other free software already supports. For example,
Lokalize supports definitions (really, really important!!!). I would
also suggest adding, as soon as you can: several translations in the
same language for the same concept, part of speech, and subject field.

References to external resources using <xref> may be very useful (to
Wikipedia, for example). You could also add grammatical gender,
grammatical number, administrative status (recommended, deprecated,
...), process status (for marking concepts under discussion), usage
in a sample sentence (use the descrip tag with the "context" type, as
TBX Basic says), and related concepts (the "See also" functionality).
IMHO you should use tig by default, and maybe you should put IDs on
the termEntry and tig tags.

Here is a tiny example:

<termEntry id="cid-23">
    <descrip type="subjectField">computer science</descrip>
    <langSet xml:lang="en">
        <descrip type="definition">A computer is a programmable
machine that receives input, stores and manipulates data, and provides
output in a useful format.</descrip>
        <xref type="xSource"
              target="http://en.wikipedia.org/wiki/Computer">Wikipedia</xref>
        <tig id="tid-59">
            <term>computer</term>
         </tig>
    </langSet>
    <langSet xml:lang="es">
        <descrip type="definition">Máquina electrónica que recibe y
procesa datos para convertirlos en información útil</descrip>
        <tig>
            <term>sistema</term>
            <termNote
type="administrativeStatus">admittedTerm-admn-sts</termNote>
        </tig>
        <tig>
            <term>equipo</term>
            <termNote type="partOfSpeech">noun</termNote>
            <termNote type="grammaticalGender">masculine</termNote>
            <termNote type="grammaticalNumber">singular</termNote>
            <termNote
type="administrativeStatus">preferredTerm-admn-sts</termNote>
        </tig>
        <tig>
            <term>PC</term>
            <termNote
type="administrativeStatus">admittedTerm-admn-sts</termNote>
            <termNote type="usageNote">Intentar no abusar de esta
traducción.</termNote>
        </tig>
        <tig>
            <term>ordenador</term>
            <termNote
type="administrativeStatus">deprecatedTerm-admn-sts</termNote>
        </tig>
        <tig>
            <term>computador</term>
            <termNote
type="administrativeStatus">deprecatedTerm-admn-sts</termNote>
        </tig>
        <tig>
            <term>computadora</term>
            <termNote
type="administrativeStatus">deprecatedTerm-admn-sts</termNote>
        </tig>
    </langSet>
    <langSet xml:lang="fr">
        <xref type="xSource"
              target="http://fr.wikipedia.org/wiki/Ordinateur">Wikipedia</xref>
        <tig>
            <term>ordinateur</term>
            <termNote type="processStatus">provisionallyProcessed</termNote>
        </tig>
    </langSet>
</termEntry>
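
To show how a consumer might read an entry like this one, here is a
minimal sketch in Python. It uses only the standard library
(xml.etree.ElementTree), not the TTK storage class, whose API I am
not describing here. The function names read_entry and
preferred_terms are hypothetical, and SAMPLE is a cut-down copy of
the entry above.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the predefined xml: prefix under this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Cut-down version of the termEntry above (illustration only).
SAMPLE = """
<termEntry id="cid-23">
    <descrip type="subjectField">computer science</descrip>
    <langSet xml:lang="en">
        <tig id="tid-59">
            <term>computer</term>
        </tig>
    </langSet>
    <langSet xml:lang="es">
        <tig>
            <term>equipo</term>
            <termNote type="partOfSpeech">noun</termNote>
            <termNote type="administrativeStatus">preferredTerm-admn-sts</termNote>
        </tig>
        <tig>
            <term>ordenador</term>
            <termNote type="administrativeStatus">deprecatedTerm-admn-sts</termNote>
        </tig>
    </langSet>
</termEntry>
"""


def read_entry(xml_text):
    """Return {language: [(term, {termNote type: value}), ...]}.

    Hypothetical helper: handles a single termEntry that uses tig
    (not ntig) and ignores descrip and xref elements.
    """
    entry = ET.fromstring(xml_text)
    result = {}
    for lang_set in entry.findall("langSet"):
        terms = result.setdefault(lang_set.get(XML_LANG), [])
        for tig in lang_set.findall("tig"):
            notes = {n.get("type"): n.text for n in tig.findall("termNote")}
            terms.append((tig.findtext("term"), notes))
    return result


def preferred_terms(entry, lang):
    """Return the terms in `lang` whose administrative status is preferred."""
    return [
        term
        for term, notes in entry.get(lang, [])
        if notes.get("administrativeStatus") == "preferredTerm-admn-sts"
    ]
```

Calling preferred_terms(read_entry(SAMPLE), "es") picks out the term
marked preferredTerm-admn-sts, which is the kind of filtering a
suggestion UI could do with the administrative status.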

There is more info here (in Galician):
http://www.certima.net/glosima/?28-xustificacion-das-escollas-de

This message may not be well structured, which may make it hard to
read, but I don't have more time right now. Please ask if you have
any questions.

>>> Now, supporting features like grouping concepts is not just about being
>>> able to read the TBX format. The terminology matching algorithms need
>>> to change, and maybe also the UI for presenting terminology
>>> suggestions. That's what we'll need to work on before we can start
>>> defaulting to TBX for terminology extraction.
>>
>> Ok. At first Pootle only needs to show suggestions. If you are
>> translating a phrase that contains "view", Pootle should show all the
>> entries from the glossary that contain "view". For a glossary made
>> with one of these tools (remember, concept-based terminology), it
>> should show at least two entries from the TBX: "view" the verb and
>> "view" the noun.
>
> I agree with you, and Pootle has already done this for a long time -
> really useful.
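
The lookup described above can be sketched as follows. This is only
an illustration under my own assumptions, not how Pootle's matching
is implemented: Concept, GLOSSARY, and suggest are hypothetical
names, the translations are made-up sample data, and the sketch
matches whole words only.

```python
import re
from dataclasses import dataclass


@dataclass
class Concept:
    # Hypothetical in-memory record, not a Pootle or TTK class.
    term: str
    part_of_speech: str
    translations: list


# Made-up sample glossary: "view" appears as two separate concepts.
GLOSSARY = [
    Concept("view", "noun", ["vista"]),
    Concept("view", "verb", ["ver"]),
    Concept("file", "noun", ["archivo"]),
]


def suggest(source_text, glossary):
    """Return every concept whose term occurs as a whole word in the text."""
    words = set(re.findall(r"\w+", source_text.lower()))
    return [concept for concept in glossary if concept.term.lower() in words]
```

For a source string such as "View the file list", suggest returns
both "view" concepts (noun and verb) plus "file", so the translator
sees each concept as its own suggestion.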
>
>> I emphasize that Pootle should only show suggestions, since it is a
>> program for translating. All it should have is a way to show
>> suggestions beside the translation fields and some interface for
>> managing glossaries (importing and deleting, but nothing more).
>> Extracting terminology from a translation database,
>> adding/editing/deleting terms, reducing the false positives generated
>> by poterminology, and things like that should be in a standalone tool
>> within the translation toolkit ecosystem (TTK, Virtaal, Pootle, and
>> now this one). IMHO this is the best way, but I realize that in the
>> beginning you may want to start by including these features in Pootle
>> and maybe pull them out into another tool later.
>
> The existing versions of Pootle can already import and export it, since
> you can download the files, and upload the files.
>
> I think the reality is that a lot of people are not benefiting from
> terminology help as much as they can, because they don't use any
> terminology creation tools. I think the upcoming version of Pootle with
> this terminology extraction will go a long way to helping people do the
> right thing.
>
>>> Now, as you seem to not only know what advanced terminology support
>>> should look like but are also passionate about it, I hope we can
>>> convince you to write a detailed description of the features you think
>>> we should support, either on the wiki
>>> http://translate.sourceforge.net/wiki/ or as a feature request in
>>> bugzilla http://bugs.locamotion.org/
>>
>> Ok. First I want to see what Pootle does right now, but I don't have
>> plenty of time for investigating. When I have time I will likely write
>> the "specifications". As I said, we are currently working on a tool to
>> manage glossaries. When it is finished we can use the ideas to rewrite
>> it using Python or Django or whatever you are using for Pootle, and
>> integrate into it the interface for managing the extraction using
>> poterminology and those things we are considering.
>
> Some of the existing behaviour is mentioned here:
>
> http://translate.sourceforge.net/wiki/pootle/terminology_matching
>
> Of course, that doesn't cover the new features to do extraction.

