On 26 March 2012 13:49, Jacob Nordfalk <[email protected]> wrote:
> 2012/3/25 Mikel Artetxe <[email protected]>
>>> I think we should leave 'reducing start-up time' for now, as it's not
>>> necessarily a task that has anything to do with embedding. Sorry.
>>
>>
>> Well, maybe I can work on it, the only thing that I meant was that right
>> now I wouldn't really know how to accomplish it. And I don't want to accept
>> something unless I am confident that I can do it...
>
>
> Perhaps other mentors have an opinion on this:
>
> Is it OK to schedule 'investigation', 'work on', 'try to' stuff in a work
> plan without clear promises of working code?
>
> My opinion is: Yes, as long as there is some kind of deliverable, for
> example a written description of what was tried and what was found,
> so others could use it as a base for future work in that area.

It really depends what you're talking about.

In this case, it looks fine. In particular 'Work on a demo JAR
executable' is an intermediate step between the previous work and the
deliverable.

If, on the other hand, it was something like 'investigate the existing
lttoolbox-java API', that would be unacceptable - that's background
reading that should be done during the community bonding period.

As a rule of thumb, I would propose: if it can be known _before_ the
start of the project, it ought to be; if it can't, it can be part of
the project.

>> OK. This would be my very first draft of the work plan:
>>
>> Week 1-3: Adapt lttoolbox-java so that it can work directly with embedded
>> files without needing to copy them to a temporary directory.
>> Week 4: Make an API class that easily allows the translation of an
>> embedded language pair. Work on a demo JAR executable usable from the
>> command line that would make use of it with a specific language pair.
>
>
> I'd set some time aside to understand the MT engine. Read some articles,
> choose 2-3 language pairs, compile them, change them, see how stuff works
> under the hood. Add some words that are missing to a language pair of your
> choice.
> Use apertium-viewer or something else that lets you see what happens at each
> stage.
>

The stuff you mention here is stuff for the community bonding period.

>
>>
>> Deliverable #1: The above mentioned JAR executable.
>> Week 5: Work on an API class that allows access to the intermediary
>> translation stages.
>> Week 6: Make a small user-oriented Swing application for a specific
>> language pair (something similar to apertium-tolk). The idea is that any
>> user whose only prerequisite is having a JVM installed could download it
>> and run it by just double-clicking it.
>> Week 7-8: Adapt and extend the previous application so that it can work
>> with several language pairs. This could be achieved by having a JAR per
>> language pair and a main JAR executable that makes use of them, or by
>> integrating everything into a single JAR executable that is able to modify
>> itself in order to add or remove language pairs (I am not sure which
>> approach would be better).
>
>
> I'd prefer to avoid modifying JAR files.
> It's a ZIP file. If people want several pairs in the same ZIP file they can
> use ZIP.
>
> I'd prefer one JAR per language pair and I'd prefer that it would be
> directly executable as-is from the command line.
> We have already established that it only adds 200 kB to the JAR. If we embed
> a small Swing app, like apertium-tolk, then it should be really small and
> simple, of course.
>
> People who don't want the extra bloat, or who want to collect several pairs
> in one JAR, can use a ZIP tool to remove the bloat or collect the pairs. The
> JAR file structure should support this, of course.
>

If it was limited to build-time support, and if the build could be
made to support building any of
1. engine + language pair
2. language pair only
3. engine + multiple language pairs

then that's something that might be worth looking into -- to tie in
with your earlier question, this is the sort of investigation that
should happen during the project.

If someone wanted to distribute a jar file with all pairs featuring,
say, Galician, and that could be supported as a build option, then
great, but it's not worth messing around with jar files to get that
option from pre-built packages.
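
To make the 'work with embedded files without a temporary directory'
item a bit more concrete, here is a minimal sketch of the general idea,
not of how lttoolbox-java does or should do it. It relies only on the
fact that a JAR is a ZIP archive: an entry can be read straight into
memory from a stream with java.util.zip, and nothing is written to disk.
The class name EmbeddedPairDemo, both helper methods, and the entry path
are hypothetical, made up purely for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for illustration only; not part of lttoolbox-java.
final class EmbeddedPairDemo {

    // Returns the bytes of the named entry, or null if the archive has no
    // such entry. The data is read from the stream into memory; no
    // temporary file or directory is created.
    static byte[] readEntry(InputStream jarStream, String entryName) {
        try (ZipInputStream zip = new ZipInputStream(jarStream)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.getName().equals(entryName)) {
                    ByteArrayOutputStream out = new ByteArrayOutputStream();
                    byte[] buf = new byte[8192];
                    int n;
                    while ((n = zip.read(buf)) != -1) {
                        out.write(buf, 0, n);
                    }
                    return out.toByteArray();
                }
            }
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a tiny single-entry archive in memory, so the sketch can run
    // without a real language pair being available.
    static byte[] buildDemoJar(String entryName, byte[] payload) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry(entryName));
            zip.write(payload);
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        // The entry path is an assumption, not an actual Apertium layout.
        String name = "pairs/xx-yy/demo.bin";
        byte[] jar = buildDemoJar(name, new byte[] {1, 2, 3});
        byte[] data = readEntry(new ByteArrayInputStream(jar), name);
        System.out.println("read " + data.length + " bytes from " + name);
    }
}
```

For data that is already on the classpath, the standard
Class.getResourceAsStream(String) call returns such a stream directly,
which is probably the simpler route; the ZIP-level version above is only
meant to show that collecting or removing pairs with an ordinary ZIP tool
would not get in the way of reading them.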


-- 
<Sefam> Are any of the mentors around?
<jimregan> yes, they're the ones trolling you
