[jira] [Commented] (JOSHUA-335) Consider using thrax2

2018-09-06 Thread Lewis John McGibbney (JIRA)


[ 
https://issues.apache.org/jira/browse/JOSHUA-335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606589#comment-16606589
 ] 

Lewis John McGibbney commented on JOSHUA-335:
-

[~mjwall]

bq. Either by replacing it in the pipeline.pl

This is the quickest solution.

bq. or reworking the pipeline as I have seen discussed in other tickets.

This is what needs to be done. It is hellish and I think we should invest a 
GSoC project in trying to achieve it. Can you reference the ticket here, 
please, if there is one?

> Consider using thrax2
> -
>
> Key: JOSHUA-335
> URL: https://issues.apache.org/jira/browse/JOSHUA-335
> Project: Joshua
>  Issue Type: Improvement
>  Components: thrax
>Reporter: Michael Wall
>Priority: Minor
>
> Ran across this https://github.com/jweese/thrax2



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (JOSHUA-335) Consider using thrax2

2018-09-06 Thread Lewis John McGibbney (JIRA)


[ 
https://issues.apache.org/jira/browse/JOSHUA-335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606465#comment-16606465
 ] 

Lewis John McGibbney commented on JOSHUA-335:
-

[~mjwall] this is interesting. IIRC Thrax is always used when building LMs in 
Joshua. Are you planning on taking this on?

> Consider using thrax2
> -
>
> Key: JOSHUA-335
> URL: https://issues.apache.org/jira/browse/JOSHUA-335
> Project: Joshua
>  Issue Type: Improvement
>  Components: thrax
>Reporter: Michael Wall
>Priority: Minor
>
> Ran across this https://github.com/jweese/thrax2





[jira] [Commented] (JOSHUA-334) Update Homebrew Formular with all language pack options

2018-02-02 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16351304#comment-16351304
 ] 

Lewis John McGibbney commented on JOSHUA-334:
-

Progress can be seen at 
https://github.com/lewismc/homebrew-core/tree/joshua_language_packs
Still lots of SHA256 calculation and remote URL resolution for Dropbox, but we 
are getting there.
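
For the SHA256 step mentioned above, here is a minimal sketch of how a checksum for a downloaded language pack archive could be computed before it is pasted into the Formula. The function name sha256_of_file is hypothetical and nothing here comes from the Formula itself; only hashlib from the Python standard library is used.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`.

    The file is read in chunks so that large language pack
    archives do not have to fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling sha256_of_file on each downloaded archive yields the 64-character hex string that a formula's sha256 field expects.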

> Update Homebrew Formular with all language pack options
> ---
>
> Key: JOSHUA-334
> URL: https://issues.apache.org/jira/browse/JOSHUA-334
> Project: Joshua
>  Issue Type: Improvement
>  Components: homebrew-formula, language packs
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Major
> Fix For: 6.2
>
>
> When I originally wrote the [Homebrew 
> Formula|https://github.com/Homebrew/homebrew-core/blob/00eea5b204b069416142352ca24314c024f5d6c7/Formula/joshua.rb#L18-L20],
>  I added options for installing the old *with-es-en-phrase-pack*, 
> *with-ar-en-phrase-pack* and *with-zh-en-hiero-pack* language packs.
> Back then, these were staged on Matt's server at Johns Hopkins but they have 
> since been relocated to Tom's Dropbox. Additionally, we now have a wealth of 
> other language packs which are not currently available through the Formula.
> This issue is pretty large in scope, but in essence will update the Formula 
> to provide options for installing [all of our language 
> packs|https://cwiki.apache.org/confluence/display/JOSHUA/Language+Packs].
> Once this is done, it will be very powerful and extremely useful tooling for 
> Joshua.





[jira] [Updated] (JOSHUA-328) failure when glue grammar is listed first

2018-02-02 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-328:

Fix Version/s: (was: 6.1)
   6.2

> failure when glue grammar is listed first
> -
>
> Key: JOSHUA-328
> URL: https://issues.apache.org/jira/browse/JOSHUA-328
> Project: Joshua
>  Issue Type: Bug
>Affects Versions: 6.1
>Reporter: Matt Post
>Priority: Major
> Fix For: 6.2
>
>
> If doing CKY-decoding (-search cky), listing the glue grammar before the 
> packed grammar results in a parsing failure. E.g., the following lines in the 
> config file:
> tm = thrax -maxspan -1 -owner glue -path model/glue.grammar
> tm = thrax -maxspan 20 -path model/grammar.packed -owner pt
> will result in failed decoding every time, and a printing of the following 
> error message:
> ERROR - the goal_bin does not have exactly one item
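
Until the bug is fixed, a config file could be checked for the problematic ordering before decoding. The sketch below is an illustration only and not part of Joshua: tm_owners and glue_listed_first are hypothetical helpers that look only at `tm = ...` lines and their `-owner` argument, as in the two lines quoted above.

```python
import shlex


def tm_owners(config_text):
    """Return the -owner value of each `tm = ...` line, in file order."""
    owners = []
    for line in config_text.splitlines():
        key, sep, value = line.partition("=")
        if not sep or key.strip() != "tm":
            continue
        tokens = shlex.split(value)
        owner = None
        # The owner is whatever token follows "-owner".
        for i, token in enumerate(tokens[:-1]):
            if token == "-owner":
                owner = tokens[i + 1]
        owners.append(owner)
    return owners


def glue_listed_first(config_text):
    """True if there are several grammars and the first one is the glue grammar."""
    owners = tm_owners(config_text)
    return len(owners) > 1 and owners[0] == "glue"
```

A pipeline wrapper could call glue_listed_first on the config text and warn, or swap the two lines, before invoking the decoder with -search cky.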





[jira] [Commented] (JOSHUA-332) Merge 7 branch into master

2018-01-10 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16320535#comment-16320535
 ] 

Lewis John McGibbney commented on JOSHUA-332:
-

If this is an entire PITA I would just leave it and close it as not an issue. 

>  Merge 7 branch into master
> ---
>
> Key: JOSHUA-332
> URL: https://issues.apache.org/jira/browse/JOSHUA-332
> Project: Joshua
>  Issue Type: Task
>  Components: core
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 7
>
>
> As discussed on the mailing list, let's branch _master_ into a _6x_ branch 
> and merge branch _7_ into _master_ in order to keep developing on top of the 
> latest in the main branch.





[jira] [Commented] (JOSHUA-333) The English-English Language Pack download links are broken.

2018-01-05 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16313425#comment-16313425
 ] 

Lewis John McGibbney commented on JOSHUA-333:
-

[~bugg_tb] were these files copied when we migrated from [~post]'s server to 
Dropbox?

> The English-English Language Pack download links are broken.
> 
>
> Key: JOSHUA-333
> URL: https://issues.apache.org/jira/browse/JOSHUA-333
> Project: Joshua
>  Issue Type: Bug
>Reporter: David Gonzalez
>
> On the Apache Joshua English-English wiki page the ruleset (PPDB v2) 
> downloads are all broken (404).
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65142863





[jira] [Commented] (JOSHUA-332) Merge 7 branch into master

2017-10-26 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16220815#comment-16220815
 ] 

Lewis John McGibbney commented on JOSHUA-332:
-

Damn Tommaso. Is there still a lot of work to do?

>  Merge 7 branch into master
> ---
>
> Key: JOSHUA-332
> URL: https://issues.apache.org/jira/browse/JOSHUA-332
> Project: Joshua
>  Issue Type: Task
>  Components: core
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 7
>
>
> As discussed on the mailing list, let's branch _master_ into a _6x_ branch 
> and merge branch _7_ into _master_ in order to keep developing on top of the 
> latest in the main branch.





[jira] [Commented] (JOSHUA-332) Merge 7 branch into master

2017-10-25 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219517#comment-16219517
 ] 

Lewis John McGibbney commented on JOSHUA-332:
-

[~teofili] I see that your recent [mailing list 
discussion|https://lists.apache.org/thread.html/b43cdffd8f3ea7b7c70929eed4aaa989af31bcdc5b5e8320ff412dd4@%3Cdev.joshua.apache.org%3E]
 may not have been resolved yet. Is this preventing the replacement of the 
current master with the 7 branch?
Thanks

>  Merge 7 branch into master
> ---
>
> Key: JOSHUA-332
> URL: https://issues.apache.org/jira/browse/JOSHUA-332
> Project: Joshua
>  Issue Type: Task
>  Components: core
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 7
>
>
> As discussed on the mailing list, let's branch _master_ into a _6x_ branch 
> and merge branch _7_ into _master_ in order to keep developing on top of the 
> latest in the main branch.





[jira] [Commented] (JOSHUA-324) Address Apache Joshua 6.1 RC#2 Issues

2017-02-21 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876442#comment-15876442
 ] 

Lewis John McGibbney commented on JOSHUA-324:
-

[~teofili] yes thank you very much, please do.

> Address Apache Joshua 6.1 RC#2 Issues
> -
>
> Key: JOSHUA-324
> URL: https://issues.apache.org/jira/browse/JOSHUA-324
> Project: Joshua
>  Issue Type: Task
>Affects Versions: 6.1
>Reporter: Lewis John McGibbney
>Assignee: Tommaso Teofili
>Priority: Blocker
> Fix For: 6.1
>
>
> Feedback from [~jmclean] (thank you Justin) on our RC#2 is as follows
> {code}
> ==
> - You're missing incubating in the release artifacts name. [1]
> - There are a number of binary files in the source release that look to be
> compiled source code.
> I checked:
> - name doesn’t include incubating
> - signatures and hashes correct
> - DISCLAIMER exists
> - LICENSE is missing a few things (see below)
> - a source file is missing an Apache header [7]
> - Several unexpected binary files are contained in the source release
> [8][9][10][11]
> - Can compile from source
> License is missing:
> - MIT licensed normalize.css v3.0.3 bundled in [5]
> - glyph icon fonts [6]
> Not an issue but it's a little odd to have LICENSE and NOTICE.txt - usually
> both are bare or both have .txt extension.
> Also while looking at your site I noticed that the download links of your
> incubating site [2] point to github, please change to point to the official
> release area.
> Also the 6.1 release has already been tagged and is available for public
> download on github [4]  before this vote is finished. This is IMO against
> Apache release policy [3] please remove.
> I also notice you recently released the language packs (18th Nov) but there
> doesn’t seem to have been a vote for that? Any reason for this?
> ===
> [1] http://incubator.apache.org/incubation/Incubation_Policy.html#Releases
> [2] 
> https://cwiki.apache.org/confluence/display/JOSHUA/Apache+Joshua+%28Incubating%29+Home
> [3] http://www.apache.org/dev/release.html#what
> [4] https://github.com/apache/incubator-joshua/releases
> [5] ./demo/bootstrap/css/bootstrap.min.css
> [6] apache-joshua-6.1/demo/bootstrap/fonts/*
> [7] ./src/test/java/org/apache/joshua/decoder/ff/tm/OwnerMapTest.java
> [8] ./bin/GIZA++
> [9] ./bin/mkcls
> [10] ./bin/snt2cooc.out
> [11] ./src/test/resources/berkeley_lm/lm.berkeleylm.gz
> [12] http://www.mail-archive.com/general%40incubator.apache.org/msg57543.html
> [13] http://www.mail-archive.com/general%40incubator.apache.org/msg57551.html
> {code}
> This is a blocking issue and until addressed we cannot release 6.1-incubating





[jira] [Commented] (JOSHUA-324) Address Apache Joshua 6.1 RC#2 Issues

2017-01-25 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838087#comment-15838087
 ] 

Lewis John McGibbney commented on JOSHUA-324:
-

[~post] the only pending issue is the mvn assembly issue I described at 
http://www.mail-archive.com/dev%40joshua.incubator.apache.org/msg02023.html
I'll have a crack today and try to resolve it.

> Address Apache Joshua 6.1 RC#2 Issues
> -
>
> Key: JOSHUA-324
> URL: https://issues.apache.org/jira/browse/JOSHUA-324
> Project: Joshua
>  Issue Type: Task
>Affects Versions: 6.1
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Blocker
> Fix For: 6.1
>
>


[jira] [Commented] (JOSHUA-324) Address Apache Joshua 6.1 RC#2 Issues

2017-01-17 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827360#comment-15827360
 ] 

Lewis John McGibbney commented on JOSHUA-324:
-

I'll be finishing my QA and producing an RC#3 tomorrow folks. Thanks.
I've just committed 
{code}
commit ae755a8bc0b1de9475285fcc8d35d8a8b5f00a6f
Author: Lewis John McGibbney 
Date:   Tue Jan 17 19:12:10 2017 -0800

JOSHUA-324 Address Apache Joshua 6.1 RC#2 Issues
{code}

> Address Apache Joshua 6.1 RC#2 Issues
> -
>
> Key: JOSHUA-324
> URL: https://issues.apache.org/jira/browse/JOSHUA-324
> Project: Joshua
>  Issue Type: Task
>Affects Versions: 6.1
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Blocker
> Fix For: 6.1
>
>


[jira] [Commented] (JOSHUA-324) Address Apache Joshua 6.1 RC#2 Issues

2016-11-29 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15706577#comment-15706577
 ] 

Lewis John McGibbney commented on JOSHUA-324:
-

Hi Folks, I've assigned this to myself and will begin working on a pull request 
to incrementally address the above issues.

> Address Apache Joshua 6.1 RC#2 Issues
> -
>
> Key: JOSHUA-324
> URL: https://issues.apache.org/jira/browse/JOSHUA-324
> Project: Joshua
>  Issue Type: Task
>Affects Versions: 6.1
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Blocker
> Fix For: 6.1
>
>


[jira] [Assigned] (JOSHUA-324) Address Apache Joshua 6.1 RC#2 Issues

2016-11-29 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney reassigned JOSHUA-324:
---

Assignee: Lewis John McGibbney

> Address Apache Joshua 6.1 RC#2 Issues
> -
>
> Key: JOSHUA-324
> URL: https://issues.apache.org/jira/browse/JOSHUA-324
> Project: Joshua
>  Issue Type: Task
>Affects Versions: 6.1
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Blocker
> Fix For: 6.1
>
>


[jira] [Created] (JOSHUA-324) Address Apache Joshua 6.1 RC#2 Issues

2016-11-29 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-324:
---

 Summary: Address Apache Joshua 6.1 RC#2 Issues
 Key: JOSHUA-324
 URL: https://issues.apache.org/jira/browse/JOSHUA-324
 Project: Joshua
  Issue Type: Task
Affects Versions: 6.1
Reporter: Lewis John McGibbney
Priority: Blocker
 Fix For: 6.1


Feedback from [~jmclean] (thank you Justin) on our RC#2 is as follows
{code}
==
- You're missing incubating in the release artifacts name. [1]
- There are a number of binary files in the source release that look to be
compiled source code.

I checked:
- name doesn’t include incubating
- signatures and hashes correct
- DISCLAIMER exists
- LICENSE is missing a few things (see below)
- a source file is missing an Apache header [7]
- Several unexpected binary files are contained in the source release
[8][9][10][11]
- Can compile from source

License is missing:
- MIT licensed normalize.css v3.0.3 bundled in [5]
- glyph icon fonts [6]

Not an issue but it's a little odd to have LICENSE and NOTICE.txt - usually
both are bare or both have .txt extension.

Also while looking at your site I noticed that the download links of your
incubating site [2] point to github, please change to point to the official
release area.
Also the 6.1 release has already been tagged and is available for public
download on github [4]  before this vote is finished. This is IMO against
Apache release policy [3] please remove.

I also notice you recently released the language packs (18th Nov) but there
doesn’t seem to have been a vote for that? Any reason for this?
===

[1] http://incubator.apache.org/incubation/Incubation_Policy.html#Releases
[2] 
https://cwiki.apache.org/confluence/display/JOSHUA/Apache+Joshua+%28Incubating%29+Home
[3] http://www.apache.org/dev/release.html#what
[4] https://github.com/apache/incubator-joshua/releases
[5] ./demo/bootstrap/css/bootstrap.min.css
[6] apache-joshua-6.1/demo/bootstrap/fonts/*
[7] ./src/test/java/org/apache/joshua/decoder/ff/tm/OwnerMapTest.java
[8] ./bin/GIZA++
[9] ./bin/mkcls
[10] ./bin/snt2cooc.out
[11] ./src/test/resources/berkeley_lm/lm.berkeleylm.gz
[12] http://www.mail-archive.com/general%40incubator.apache.org/msg57543.html
[13] http://www.mail-archive.com/general%40incubator.apache.org/msg57551.html
{code}
This is a blocking issue and until addressed we cannot release 6.1-incubating
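
Two of the findings above (a source file without an Apache header, and unexpected binaries in the source release) lend themselves to a quick local check before cutting the next RC. The sketch below is a rough stand-in, not a replacement for a proper release audit: the function names, the extension list, and the NUL-byte heuristic for spotting binaries are all assumptions made for illustration.

```python
import os

# Opening words of the standard ASF source header.
HEADER_MARKER = "Licensed to the Apache Software Foundation"
# Assumed set of extensions that should carry the header.
SOURCE_EXTENSIONS = (".java", ".py", ".pl", ".sh")


def looks_binary(path, sample_size=8192):
    """Heuristic: treat a file as binary if its first bytes contain NUL."""
    with open(path, "rb") as handle:
        return b"\0" in handle.read(sample_size)


def audit_tree(root):
    """Return (missing_header, binaries) lists of paths found under root."""
    missing_header, binaries = [], []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if looks_binary(path):
                binaries.append(path)
            elif name.endswith(SOURCE_EXTENSIONS):
                with open(path, encoding="utf-8", errors="replace") as handle:
                    if HEADER_MARKER not in handle.read(4096):
                        missing_header.append(path)
    return missing_header, binaries
```

Run against an unpacked source release, audit_tree returns the two lists, which can be compared with items [7] to [11] in the feedback.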





[jira] [Updated] (JOSHUA-315) Thrax keeps all rules

2016-11-22 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-315:

Fix Version/s: (was: 6.2)
   6.1

> Thrax keeps all rules
> -
>
> Key: JOSHUA-315
> URL: https://issues.apache.org/jira/browse/JOSHUA-315
> Project: Joshua
>  Issue Type: Bug
>Reporter: Matt Post
> Fix For: 6.1
>
>
> When extracting rules, Thrax keeps *all* target-side options for each source 
> side. For large bitexts and common source sides (e.g., "de" for 
> Spanish–English), there can be tens of thousands of translations, due to 
> errors in the alignments and phenomena like garbage collection. The decoder 
> throws out all but the top num_translation_options of these (default 20), but 
> before doing so, it has to score all the target-side options with all feature 
> functions, including the language model. This slows down "warming up" of the 
> model and means that the first sentences to use these items are very slow to 
> translate.
> I have updated scripts/training/filter-rules.pl to filter using Thrax's 
> rarity penalty field, but it would be much better if Thrax were to keep only 
> the 100 most frequent translation options for each source side.
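
The pruning proposed above can be sketched as follows. This is not Thrax code: it assumes the rules are available as (source, target, count) tuples, which is a simplification of the real grammar format, and keep_top_k is a hypothetical name.

```python
import heapq
from collections import defaultdict


def keep_top_k(rules, k=100):
    """Keep the k most frequent target sides for each source side.

    `rules` is an iterable of (source, target, count) tuples.
    """
    by_source = defaultdict(list)
    for source, target, count in rules:
        by_source[source].append((count, target))
    kept = []
    for source, options in by_source.items():
        # nlargest orders by count first, then by target string on ties.
        for count, target in heapq.nlargest(k, options):
            kept.append((source, target, count))
    return kept
```

With k=100, a source side such as "de" with tens of thousands of extracted translations would reach the decoder with at most 100 of them, so far fewer options need scoring before the num_translation_options cut-off is applied.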





[jira] [Updated] (JOSHUA-316) run_bundler.py returning JOB FAILED (return code 1) TypeError: memoryview: a bytes-like object is required, not 'str'

2016-11-22 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-316:

Fix Version/s: (was: 6.2)
   6.1

> run_bundler.py returning JOB FAILED (return code 1) TypeError: memoryview: a 
> bytes-like object is required, not 'str'
> -
>
> Key: JOSHUA-316
> URL: https://issues.apache.org/jira/browse/JOSHUA-316
> Project: Joshua
>  Issue Type: Bug
>  Components: bundler
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Critical
> Fix For: 6.1
>
>
> {code}
> [glue-tune] rebuilding...
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp2/grammar.packed/slice_0.source
>  [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue
>  [NOT FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/support/create_glue_grammar.sh 
> /usr/local/joshua_resources/russian_experiments/exp2/grammar.packed > 
> /usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue
>   took 1 seconds (1s)
> [tune-bundle] rebuilding...
>   
> dep=/usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
> [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp2/grammar.packed/slice_0.source
>  [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp2/tune/model/run-joshua.sh
>  [NOT FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/support/run_bundler.py --force 
> --symlink --absolute --verbose -T /usr/local/hadoop-2.5.2/hadoop_tmp_dir 
> /usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
> /usr/local/joshua_resources/russian_experiments/exp2/tune/model 
> --copy-config-options '-top-n 300 -output-format "%i ||| %s ||| %f ||| %c" 
> -mark-oovs false -search cky -weights "lm_0 1 tm_pt_0 1 tm_pt_1 1 tm_pt_2 1 
> tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " -feature-function 
> "StateMinimizingLanguageModel -lm_order 5 -lm_file 
> /usr/local/joshua_resources/russian_experiments/exp2/lm.kenlm"  -tm0/type 
> hiero -tm0/owner pt -tm0/maxspan 20 -tm1/owner glue' --pack-tm 
> /usr/local/joshua_resources/russian_experiments/exp2/grammar.packed --tm 
> /usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue
>   JOB FAILED (return code 1)
> * Running the copy-config.pl script with the command: 
> /usr/local/incubator-joshua/scripts/copy-config.pl -top-n 300 -output-format 
> "%i ||| %s ||| %f ||| %c" -mark-oovs false -search cky -weights "lm_0 1 
> tm_pt_0 1 tm_pt_1 1 tm_pt_2 1 tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " 
> -feature-function "StateMinimizingLanguageModel -lm_order 5 -lm_file 
> /usr/local/joshua_resources/russian_experiments/exp2/lm.kenlm"  -tm0/type 
> hiero -tm0/owner pt -tm0/maxspan 20 -tm1/owner glue
> Traceback (most recent call last):
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 
> 748, in main
> operations = collect_operations(opts)
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 
> 637, in collect_operations
> opts.copy_config_options
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 
> 202, in filter_through_copy_config_script
> result, err = p.communicate(config_text)
>   File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 1072, 
> in communicate
> stdout, stderr = self._communicate(input, endtime, timeout)
>   File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 1700, 
> in _communicate
> input_view = memoryview(self._input)
> TypeError: memoryview: a bytes-like object is required, not 'str'
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 
> 760, in 
> main(sys.argv)
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 
> 751, in main
> error_quit(e.message)
> AttributeError: 'TypeError' object has no attribute 'message'
> * WARNING: no key 'outputformat' found in config file (appending to end)
> * WARNING: no key 'search' found in config file (appending to end)
> * WARNING: no key 'topn' found in config file (appending to end)
> * WARNING: no key 'markoovs' found in config file (appending to end)
> {code}
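
The first traceback is the usual Python 3 symptom of passing a str to Popen.communicate() on a pipe opened in binary mode. The second arises because Python 3 exceptions have no .message attribute. Below is a minimal sketch of both fixes. It assumes nothing about the rest of run_bundler.py, filter_through is a hypothetical name, and the child command is a stand-in that simply echoes stdin rather than copy-config.pl.

```python
import subprocess
import sys


def filter_through(cmd, text):
    """Pipe `text` through `cmd` and return the command's stdout as a str."""
    p = subprocess.Popen(cmd, stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # Encode before writing: binary pipes need bytes under Python 3.
    out, err = p.communicate(text.encode("utf-8"))
    if p.returncode != 0:
        raise RuntimeError(err.decode("utf-8", errors="replace"))
    return out.decode("utf-8")


# Stand-in child process that copies stdin to stdout.
ECHO = [sys.executable, "-c",
        "import sys; sys.stdout.write(sys.stdin.read())"]

try:
    result = filter_through(ECHO, "search = cky\n")
except Exception as e:
    # str(e) works on Python 2 and 3; e.message exists only on Python 2.
    sys.exit(str(e))
```

Encoding at the call site keeps the pipe in binary mode. Passing universal_newlines=True to Popen is an alternative that lets communicate() accept and return str directly.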





[jira] [Resolved] (JOSHUA-317) SyntaxError: invalid syntax scripts/training/run_tuner.py", line 391

2016-11-22 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved JOSHUA-317.
-
Resolution: Fixed

> SyntaxError: invalid syntax scripts/training/run_tuner.py", line 391
> 
>
> Key: JOSHUA-317
> URL: https://issues.apache.org/jira/browse/JOSHUA-317
> Project: Joshua
>  Issue Type: Bug
>  Components: tuner
>Affects Versions: 6.0.5
> Environment: Python 3.5
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
> Fix For: 6.1
>
>
> {code}
> [tune-bundle] rebuilding...
>   
> dep=/usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
> [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp3/grammar.packed/slice_0.source
>  [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/model/run-joshua.sh
>  [NOT FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/support/run_bundler.py --force 
> --symlink --absolute --verbose -T /usr/local/hadoop-2.5.2/hadoop_tmp_dir 
> /usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/model 
> --copy-config-options '-top-n 300 -output-format "%i ||| %s ||| %f ||| %c" 
> -mark-oovs false -search cky -weights "lm_0 1 tm_pt_0 1 tm_pt_1 1 tm_pt_2 1 
> tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " -feature-function 
> "StateMinimizingLanguageModel -lm_order 5 -lm_file 
> /usr/local/joshua_resources/russian_experiments/exp3/lm.kenlm"  -tm0/type 
> hiero -tm0/owner pt -tm0/maxspan 20 -tm1/owner glue' --pack-tm 
> /usr/local/joshua_resources/russian_experiments/exp3/grammar.packed --tm 
> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/grammar.glue
>   took 0 seconds (0s)
> [mert-1] rebuilding...
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en 
> [CHANGED]
>   dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
> [CHANGED]
>   dep=tune/model/grammar.packed/slice_0.source [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config.final
>  [NOT FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/training/run_tuner.py 
> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en 
> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.ru 
> --tunedir /usr/local/joshua_resources/russian_experiments/exp3/tune --tuner 
> mert --decoder 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/decoder_command 
> --decoder-config 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
> --decoder-output-file 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/output.nbest 
> --decoder-log-file 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.log 
> --iterations 10 --metric 'BLEU 4 closest'
>   JOB FAILED (return code 1)
>   File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 391
> 'ITERATIONS': `iterations`,
>   ^
> SyntaxError: invalid syntax
> {code}
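
The caret in the trace above points at the backtick expression around iterations. Backticks were Python 2 shorthand for repr() and were removed in Python 3, which fits the Python 3.5 environment recorded on this ticket. A minimal sketch of a portable replacement follows, assuming the surrounding code builds a dict of template parameters. The function name build_params is hypothetical and only the 'ITERATIONS' key is taken from the trace; this is not the actual patch.

```python
def build_params(iterations):
    """Return template parameters with values rendered as strings.

    Hypothetical helper for illustration; only the 'ITERATIONS' key
    appears in the traceback above.
    """
    return {
        # Python 2 only: 'ITERATIONS': `iterations`,
        # repr() gives the same result and parses on Python 2 and 3.
        'ITERATIONS': repr(iterations),
    }


params = build_params(10)
print(params['ITERATIONS'])
```

Any other backtick expressions in the same script would need the same treatment before the file parses under Python 3.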





[jira] [Resolved] (JOSHUA-316) run_bundler.py returning JOB FAILED (return code 1) TypeError: memoryview: a bytes-like object is required, not 'str'

2016-11-22 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved JOSHUA-316.
-
Resolution: Fixed

> run_bundler.py returning JOB FAILED (return code 1) TypeError: memoryview: a 
> bytes-like object is required, not 'str'
> -
>
> Key: JOSHUA-316
> URL: https://issues.apache.org/jira/browse/JOSHUA-316
> Project: Joshua
>  Issue Type: Bug
>  Components: bundler
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Critical
> Fix For: 6.1
>
>
> {code}
> [glue-tune] rebuilding...
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp2/grammar.packed/slice_0.source
>  [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue
>  [NOT FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/support/create_glue_grammar.sh 
> /usr/local/joshua_resources/russian_experiments/exp2/grammar.packed > 
> /usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue
>   took 1 seconds (1s)
> [tune-bundle] rebuilding...
>   
> dep=/usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
> [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp2/grammar.packed/slice_0.source
>  [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp2/tune/model/run-joshua.sh
>  [NOT FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/support/run_bundler.py --force 
> --symlink --absolute --verbose -T /usr/local/hadoop-2.5.2/hadoop_tmp_dir 
> /usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
> /usr/local/joshua_resources/russian_experiments/exp2/tune/model 
> --copy-config-options '-top-n 300 -output-format "%i ||| %s ||| %f ||| %c" 
> -mark-oovs false -search cky -weights "lm_0 1 tm_pt_0 1 tm_pt_1 1 tm_pt_2 1 
> tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " -feature-function 
> "StateMinimizingLanguageModel -lm_order 5 -lm_file 
> /usr/local/joshua_resources/russian_experiments/exp2/lm.kenlm"  -tm0/type 
> hiero -tm0/owner pt -tm0/maxspan 20 -tm1/owner glue' --pack-tm 
> /usr/local/joshua_resources/russian_experiments/exp2/grammar.packed --tm 
> /usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue
>   JOB FAILED (return code 1)
> * Running the copy-config.pl script with the command: 
> /usr/local/incubator-joshua/scripts/copy-config.pl -top-n 300 -output-format 
> "%i ||| %s ||| %f ||| %c" -mark-oovs false -search cky -weights "lm_0 1 
> tm_pt_0 1 tm_pt_1 1 tm_pt_2 1 tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " 
> -feature-function "StateMinimizingLanguageModel -lm_order 5 -lm_file 
> /usr/local/joshua_resources/russian_experiments/exp2/lm.kenlm"  -tm0/type 
> hiero -tm0/owner pt -tm0/maxspan 20 -tm1/owner glue
> Traceback (most recent call last):
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 
> 748, in main
> operations = collect_operations(opts)
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 
> 637, in collect_operations
> opts.copy_config_options
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 
> 202, in filter_through_copy_config_script
> result, err = p.communicate(config_text)
>   File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 1072, 
> in communicate
> stdout, stderr = self._communicate(input, endtime, timeout)
>   File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 1700, 
> in _communicate
> input_view = memoryview(self._input)
> TypeError: memoryview: a bytes-like object is required, not 'str'
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 
> 760, in <module>
> main(sys.argv)
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 
> 751, in main
> error_quit(e.message)
> AttributeError: 'TypeError' object has no attribute 'message'
> * WARNING: no key 'outputformat' found in config file (appending to end)
> * WARNING: no key 'search' found in config file (appending to end)
> * WARNING: no key 'topn' found in config file (appending to end)
> * WARNING: no key 'markoovs' found in config file (appending to end)
> {code}
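
The trace above shows two separate Python 3 incompatibilities: communicate() is handed a str while the pipe is in binary mode, and the error handler reads e.message, which Python 3 exceptions do not have. A minimal sketch of both fixes follows, assuming the config text is UTF-8. The function name filter_through and the echo child process are hypothetical stand-ins for filter_through_copy_config_script and copy-config.pl; this is not the actual patch.

```python
import subprocess
import sys


def filter_through(cmd, text):
    """Pipe text through cmd and return its stdout as a str.

    Hypothetical stand-in for filter_through_copy_config_script.
    """
    p = subprocess.Popen(cmd, stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # On Python 3 the pipes are binary by default, so encode the input
    # and decode the output explicitly.
    out, err = p.communicate(text.encode('utf-8'))
    if p.returncode != 0:
        raise RuntimeError(err.decode('utf-8', 'replace'))
    return out.decode('utf-8')


try:
    # A tiny Python child process stands in for copy-config.pl here.
    echo = [sys.executable, '-c',
            'import sys; sys.stdout.write(sys.stdin.read())']
    print(filter_through(echo, 'top-n = 300\n'))
except Exception as e:
    # Exceptions have no .message attribute on Python 3; str(e) is portable.
    sys.exit(str(e))
```

An alternative would be to open the pipes in text mode with universal_newlines=True, which lets communicate() accept and return str directly.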





[jira] [Updated] (JOSHUA-290) Provide Joshua artifact as a bundle

2016-11-14 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-290:

Fix Version/s: 6.2

> Provide Joshua artifact as a bundle
> ---
>
> Key: JOSHUA-290
> URL: https://issues.apache.org/jira/browse/JOSHUA-290
> Project: Joshua
>  Issue Type: Task
>  Components: build
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 6.2
>
>
> I think it'd be good if we could make the Joshua artifact an OSGi _bundle_.
> This would have no impact on plain java applications but would give the 
> following benefits:
> - make it possible to install it in OSGi environments
> - optionally introduce semantic versioning (together with the baseline 
> plugin) that would help track e.g. if changes in APIs break backward 
> compatibility 





[jira] [Updated] (JOSHUA-51) add jhclark/bigfatlm

2016-11-14 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-51?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-51:
---
Fix Version/s: 6.1

> add jhclark/bigfatlm
> 
>
> Key: JOSHUA-51
> URL: https://issues.apache.org/jira/browse/JOSHUA-51
> Project: Joshua
>  Issue Type: Bug
>Reporter: Matt Post
>Assignee: Matt Post
> Fix For: 6.2
>
>
> It would be nice to leverage more Hadoop tools in the pipeline.





[jira] [Updated] (JOSHUA-314) Enable set structured-output from config file

2016-11-14 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-314:

Fix Version/s: 6.2

> Enable set structured-output from config file
> -
>
> Key: JOSHUA-314
> URL: https://issues.apache.org/jira/browse/JOSHUA-314
> Project: Joshua
>  Issue Type: Improvement
>  Components: core
>Reporter: Tommaso Teofili
> Fix For: 6.2
>
>
> Currently, setting _use-structured-output = true_ in joshua.config results in 
> an error when parsing the config, because the key is not explicitly handled by 
> {{JoshuaConfiguration#readConfig}} (it can only be set programmatically). I 
> think it'd be nice to be able to configure it from the config file too.





[jira] [Updated] (JOSHUA-51) add jhclark/bigfatlm

2016-11-14 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-51?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-51:
---
Fix Version/s: (was: 6.1)
   6.2

> add jhclark/bigfatlm
> 
>
> Key: JOSHUA-51
> URL: https://issues.apache.org/jira/browse/JOSHUA-51
> Project: Joshua
>  Issue Type: Bug
>Reporter: Matt Post
>Assignee: Matt Post
> Fix For: 6.2
>
>
> It would be nice to leverage more Hadoop tools in the pipeline.





[jira] [Resolved] (JOSHUA-323) Joshua 6.1 Release Management

2016-11-14 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved JOSHUA-323.
-
Resolution: Fixed

> Joshua 6.1 Release Management
> -
>
> Key: JOSHUA-323
> URL: https://issues.apache.org/jira/browse/JOSHUA-323
> Project: Joshua
>  Issue Type: Task
>  Components: build, release
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Blocker
> Fix For: 6.1
>
>
> This is a governing ticket for reference more than anything else. We need to 
> add all release specific build additions to parent pom.xml which enable us to 
> roll a release candidate.
> The process is also being documented over at 
> https://cwiki.apache.org/confluence/display/JOSHUA/Joshua+Release+Management+Procedure





[jira] [Commented] (JOSHUA-323) Joshua 6.1 Release Management

2016-11-11 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656783#comment-15656783
 ] 

Lewis John McGibbney commented on JOSHUA-323:
-

All licensing is now addressed and merged into master. I have some work to do 
with regards to release packaging which is not quite up to scratch but I will 
work on that tomorrow.

> Joshua 6.1 Release Management
> -
>
> Key: JOSHUA-323
> URL: https://issues.apache.org/jira/browse/JOSHUA-323
> Project: Joshua
>  Issue Type: Task
>  Components: build, release
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Blocker
> Fix For: 6.1
>
>
> This is a governing ticket for reference more than anything else. We need to 
> add all release specific build additions to parent pom.xml which enable us to 
> roll a release candidate.
> The process is also being documented over at 
> https://cwiki.apache.org/confluence/display/JOSHUA/Joshua+Release+Management+Procedure





[jira] [Commented] (JOSHUA-323) Joshua 6.1 Release Management

2016-11-10 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656193#comment-15656193
 ] 

Lewis John McGibbney commented on JOSHUA-323:
-

Progress going well. RAT license headers are taking a wee while but will have 
them cracked for tomorrow. Following files are outstanding.
Progress can be tracked over on 
https://github.com/apache/incubator-joshua/pull/76
{code}
Files with unapproved licenses:

  scripts/analysis/sentence-by-sentence.pl
  scripts/analysis/tree_visualizer
  scripts/copy-config.pl
  scripts/distributedLM/config.template
  scripts/distributedLM/create_remote_sym_tbl.pl
  scripts/distributedLM/filter_lm.pl
  scripts/distributedLM/get_grammar_eng_voc.pl
  scripts/distributedLM/get_grammar_eng_voc_from_cn_voc.pl
  scripts/distributedLM/global_symol_list
  scripts/distributedLM/lm.list.withweights
  scripts/ems/config.ghkm
  scripts/ems/config.hiero
  scripts/ems/config.phrase
  scripts/ems/experiment.meta
  scripts/language-pack/build_lp.sh
  scripts/language-pack/README.template
  scripts/misc/canonical_path
  scripts/misc/iso639
  scripts/preparation/detokenize.pl
  scripts/preparation/lowercase.pl
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.ca
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.cs
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.de
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.el
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.en
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.es
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.fr
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.hu
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.is
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.it
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.lv
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.nl
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.pl
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.pt
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.ro
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.ru
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.sk
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.sl
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.sv
  scripts/preparation/normalize.pl
  scripts/preparation/tokenize.pl
  scripts/support/bbn2plf.pl
  scripts/support/extract-1best
  scripts/support/grammar-packer.pl
  scripts/support/moses2joshua.pl
  scripts/support/moses2joshua_grammar.pl
  scripts/support/phrase2hiero.py
  scripts/support/score-hypothesis.pl
  scripts/support/split2files
  scripts/training/add-OOVs.pl
  scripts/training/build-vocab.pl
  scripts/training/cachepipe/bashrc
  scripts/training/cachepipe/CachePipe.pm
  scripts/training/filter-empty-lines.pl
  scripts/training/filter-rules.pl
  scripts/training/get_grammar_features.pl
  scripts/training/lowercase-leaves.pl
  scripts/training/mira/feature_label_munger.pl
  scripts/training/mira/run-mira.pl
  scripts/training/paralign.pl
  scripts/training/parallelize/LocalConfig.pm
  scripts/training/parallelize/Makefile
  scripts/training/parallelize/parallelize.pl
  scripts/training/parallelize/sentclient.c
  scripts/training/parallelize/sentserver.c
  scripts/training/parallelize/sentserver.h
  scripts/training/paste
  scripts/training/run-giza.pl
  scripts/training/scat
  scripts/training/summarize.pl
  scripts/training/templates/alignment/jacana/resources/model/tagdict
  scripts/training/templates/alignment/word-align.conf
  scripts/training/templates/glue-grammar
  scripts/training/templates/glue-grammar.itg
  scripts/training/templates/hadoop/core-site.xml
  scripts/training/templates/hadoop/hdfs-site.xml
  scripts/training/templates/hadoop/mapred-site.xml
  scripts/training/templates/hadoop/masters
  scripts/training/templates/hadoop/slaves
  scripts/training/templates/thrax-hiero.conf
  scripts/training/templates/thrax-phrasal.conf
  scripts/training/templates/thrax-phrase-gt.conf
  scripts/training/templates/thrax-phrase.conf
  scripts/training/templates/thrax-samt.conf
  scripts/training/templates/tune/decoder_command
  scripts/training/templates/tune/decoder_command.qsub
  scripts/training/templates/tune/joshua.config
  scripts/training/TODO
  scripts/training/trim_parallel_corpus.pl
  scripts/training/unmap-html.pl
{code}

> Joshua 6.1 Release Management
> -
>
> Key: JOSHUA-323
> URL: https://issues.apache.org/jira/browse/JOSHUA-323
> Project: Joshua
>  Issue Type: Task
>  Components: build, release
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Blocker
> Fix For: 6.1
>
>
> This is a 

[jira] [Updated] (JOSHUA-317) SyntaxError: invalid syntax scripts/training/run_tuner.py", line 391

2016-11-10 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-317:

Fix Version/s: (was: 6.1)
   6.2

> SyntaxError: invalid syntax scripts/training/run_tuner.py", line 391
> 
>
> Key: JOSHUA-317
> URL: https://issues.apache.org/jira/browse/JOSHUA-317
> Project: Joshua
>  Issue Type: Bug
>  Components: tuner
>Affects Versions: 6.0.5
> Environment: Python 3.5
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
> Fix For: 6.2
>
>
> {code}
> [tune-bundle] rebuilding...
>   
> dep=/usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
> [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp3/grammar.packed/slice_0.source
>  [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/model/run-joshua.sh
>  [NOT FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/support/run_bundler.py --force 
> --symlink --absolute --verbose -T /usr/local/hadoop-2.5.2/hadoop_tmp_dir 
> /usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/model 
> --copy-config-options '-top-n 300 -output-format "%i ||| %s ||| %f ||| %c" 
> -mark-oovs false -search cky -weights "lm_0 1 tm_pt_0 1 tm_pt_1 1 tm_pt_2 1 
> tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " -feature-function 
> "StateMinimizingLanguageModel -lm_order 5 -lm_file 
> /usr/local/joshua_resources/russian_experiments/exp3/lm.kenlm"  -tm0/type 
> hiero -tm0/owner pt -tm0/maxspan 20 -tm1/owner glue' --pack-tm 
> /usr/local/joshua_resources/russian_experiments/exp3/grammar.packed --tm 
> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/grammar.glue
>   took 0 seconds (0s)
> [mert-1] rebuilding...
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en 
> [CHANGED]
>   dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
> [CHANGED]
>   dep=tune/model/grammar.packed/slice_0.source [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config.final
>  [NOT FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/training/run_tuner.py 
> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en 
> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.ru 
> --tunedir /usr/local/joshua_resources/russian_experiments/exp3/tune --tuner 
> mert --decoder 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/decoder_command 
> --decoder-config 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
> --decoder-output-file 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/output.nbest 
> --decoder-log-file 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.log 
> --iterations 10 --metric 'BLEU 4 closest'
>   JOB FAILED (return code 1)
>   File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 391
> 'ITERATIONS': `iterations`,
>   ^
> SyntaxError: invalid syntax
> {code}





[jira] [Created] (JOSHUA-323) Joshua 6.1 Release Management

2016-11-10 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-323:
---

 Summary: Joshua 6.1 Release Management
 Key: JOSHUA-323
 URL: https://issues.apache.org/jira/browse/JOSHUA-323
 Project: Joshua
  Issue Type: Task
  Components: release, build
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Priority: Blocker
 Fix For: 6.1


This is a governing ticket for reference more than anything else. We need to 
add all release specific build additions to parent pom.xml which enable us to 
roll a release candidate.
The process is also being documented over at 
https://cwiki.apache.org/confluence/display/JOSHUA/Joshua+Release+Management+Procedure





[jira] [Created] (JOSHUA-321) Add JOSHUA env to ./bin/bleu and ./bin/extract-1best bash scripts

2016-11-09 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-321:
---

 Summary: Add JOSHUA env to ./bin/bleu and ./bin/extract-1best bash 
scripts
 Key: JOSHUA-321
 URL: https://issues.apache.org/jira/browse/JOSHUA-321
 Project: Joshua
  Issue Type: Bug
Affects Versions: 6.0.5
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Priority: Trivial
 Fix For: 6.1


Right now neither bleu nor extract-1best sets the required $JOSHUA env 
variable, which results in an error if it is not set within the user's 
environment. This currently breaks the Homebrew install, amongst other things, so 
we should add it.





[jira] [Commented] (JOSHUA-318) scripts/training/run_tuner.py should enable configurable memory usage when invoking joshua-decoder

2016-11-02 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15629835#comment-15629835
 ] 

Lewis John McGibbney commented on JOSHUA-318:
-

Agreed, it's set for fix 6.2... if we ever release 6.2.

> scripts/training/run_tuner.py should enable configurable memory usage when 
> invoking joshua-decoder
> ---
>
> Key: JOSHUA-318
> URL: https://issues.apache.org/jira/browse/JOSHUA-318
> Project: Joshua
>  Issue Type: Improvement
>  Components: tuner
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
> Fix For: 6.2
>
>
> When I run the run_tuner.py script I can easily run into the following
> {code}
> [mert-1] rebuilding...
>   dep=/usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en
>   dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
> [CHANGED]
>   dep=tune/model/grammar.gz.packed/slice_0.source [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config.final
>  [NOT FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/training/run_tuner.py 
> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en 
> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.ru 
> --tunedir /usr/local/joshua_resources/russian_experiments/exp3/tune --tuner 
> mert --decoder 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/decoder_command 
> --decoder-config 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
> --decoder-output-file 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/output.nbest 
> --decoder-log-file 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.log 
> --iterations 10 --metric 'BLEU 4 closest'
>   JOB FAILED (return code 1)
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>   at 
> org.apache.joshua.decoder.ff.tm.packed.PackedGrammar$PackedSlice.initializeFeatureStructures(PackedGrammar.java:385)
>   at 
> org.apache.joshua.decoder.ff.tm.packed.PackedGrammar$PackedSlice.<init>(PackedGrammar.java:368)
>   at 
> org.apache.joshua.decoder.ff.tm.packed.PackedGrammar.<init>(PackedGrammar.java:153)
>   at 
> org.apache.joshua.decoder.Decoder.initializeTranslationGrammars(Decoder.java:458)
>   at org.apache.joshua.decoder.Decoder.initialize(Decoder.java:389)
>   at org.apache.joshua.decoder.Decoder.<init>(Decoder.java:128)
>   at org.apache.joshua.decoder.JoshuaDecoder.main(JoshuaDecoder.java:69)
> Traceback (most recent call last):
>   File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 553, 
> in <module>
> main(sys.argv)
>   File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 536, 
> in main
> run_zmert(opts.tunedir, opts.source, opts.target, opts.decoder, 
> opts.decoder_config, opts.decoder_output_file, opts)
>   File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 417, 
> in run_zmert
> opts.metric, opts.iterations or 10)
>   File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 399, 
> in setup_configs
> for feature,weight in get_features(config):
>   File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 351, 
> in get_features
> output = check_output("%s/bin/joshua-decoder -c %s -show-weights -v 0" % 
> (JOSHUA, config_file), shell=True)
>   File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 626, in 
> check_output
> **kwargs).stdout
>   File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 708, in 
> run
> output=stdout, stderr=stderr)
> subprocess.CalledProcessError: Command 
> '/usr/local/incubator-joshua/bin/joshua-decoder -c 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
> -show-weights -v 0' returned non-zero exit status 1
> {code}
> This is because, by default, the joshua-decoder script runs with 4g of memory. 
> The run_tuner.py script should be flexible enough to continue with the memory 
> allocation provided when the pipeline was initially invoked. This value should 
> then be passed to the joshua-decoder script.
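
One way the request above could look is sketched here, building the -show-weights command seen in the traceback with an optional memory setting. The function name decoder_weights_command and its memory parameter are hypothetical, and the -m flag is an assumption about bin/joshua-decoder that should be checked against the script before relying on it.

```python
def decoder_weights_command(joshua_home, config_file, memory=None):
    """Build the command line used to list the decoder's feature weights.

    `memory` (for example '10g') is a hypothetical parameter; the '-m'
    flag is an assumption about bin/joshua-decoder and should be checked.
    """
    cmd = ['%s/bin/joshua-decoder' % joshua_home]
    if memory:
        # Forward the memory value the pipeline was started with.
        cmd += ['-m', memory]
    cmd += ['-c', config_file, '-show-weights', '-v', '0']
    return cmd


print(' '.join(decoder_weights_command('/usr/local/incubator-joshua',
                                       'tune/joshua.config', '10g')))
```

Returning a list rather than a formatted string means the result can be passed to subprocess.check_output without shell=True.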





[jira] [Commented] (JOSHUA-317) SyntaxError: invalid syntax scripts/training/run_tuner.py", line 391

2016-11-02 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15629831#comment-15629831
 ] 

Lewis John McGibbney commented on JOSHUA-317:
-

lmcgibbn@LMC-056430 /usr/local/joshua_resources/russian_experiments $ python 
--version
Python 3.5.2 :: Continuum Analytics, Inc.

> SyntaxError: invalid syntax scripts/training/run_tuner.py", line 391
> 
>
> Key: JOSHUA-317
> URL: https://issues.apache.org/jira/browse/JOSHUA-317
> Project: Joshua
>  Issue Type: Bug
>  Components: tuner
>Affects Versions: 6.0.5
> Environment: Python 3.5
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
> Fix For: 6.1
>
>
> {code}
> [tune-bundle] rebuilding...
>   
> dep=/usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
> [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp3/grammar.packed/slice_0.source
>  [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/model/run-joshua.sh
>  [NOT FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/support/run_bundler.py --force 
> --symlink --absolute --verbose -T /usr/local/hadoop-2.5.2/hadoop_tmp_dir 
> /usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/model 
> --copy-config-options '-top-n 300 -output-format "%i ||| %s ||| %f ||| %c" 
> -mark-oovs false -search cky -weights "lm_0 1 tm_pt_0 1 tm_pt_1 1 tm_pt_2 1 
> tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " -feature-function 
> "StateMinimizingLanguageModel -lm_order 5 -lm_file 
> /usr/local/joshua_resources/russian_experiments/exp3/lm.kenlm"  -tm0/type 
> hiero -tm0/owner pt -tm0/maxspan 20 -tm1/owner glue' --pack-tm 
> /usr/local/joshua_resources/russian_experiments/exp3/grammar.packed --tm 
> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/grammar.glue
>   took 0 seconds (0s)
> [mert-1] rebuilding...
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en 
> [CHANGED]
>   dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
> [CHANGED]
>   dep=tune/model/grammar.packed/slice_0.source [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config.final
>  [NOT FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/training/run_tuner.py 
> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en 
> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.ru 
> --tunedir /usr/local/joshua_resources/russian_experiments/exp3/tune --tuner 
> mert --decoder 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/decoder_command 
> --decoder-config 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
> --decoder-output-file 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/output.nbest 
> --decoder-log-file 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.log 
> --iterations 10 --metric 'BLEU 4 closest'
>   JOB FAILED (return code 1)
>   File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 391
> 'ITERATIONS': `iterations`,
>   ^
> SyntaxError: invalid syntax
> {code}





[jira] [Resolved] (JOSHUA-319) test-decode decoder_command results in java.lang.NumberFormatException: For input string: "MAXSPAN"

2016-10-28 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved JOSHUA-319.
-
Resolution: Not A Problem

The issue is caused by the pipeline failing at the mert stage. The mert stage 
is then cached as complete even though the final config file is never actually 
produced, which causes the subsequent decoding task to fail.
I re-ran another pipeline, which finished flawlessly.

> test-decode decoder_command results in java.lang.NumberFormatException: For 
> input string: "MAXSPAN"
> ---
>
> Key: JOSHUA-319
> URL: https://issues.apache.org/jira/browse/JOSHUA-319
> Project: Joshua
>  Issue Type: Bug
>  Components: decoders
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
> Fix For: 6.1
>
>
> When I run the following command
> {code}
> /usr/local/incubator-joshua/bin/pipeline.pl  --rundir . --type hiero --corpus 
> /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en --tune 
> /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en.tune 
> --test 
> /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en.test 
> --source en --target ru --readme "Experiment 3 Run 1 of ru --> en model 
> training" --aligner berkeley --hadoop-mem 10g --tmp 
> /usr/local/hadoop-2.5.2/hadoop_tmp_dir --first-step test --grammar 
> /usr/local/joshua_resources/russian_experiments/exp3/grammar.gz --joshua-mem 
> 10g
> {code}
> I end up with the following message.
> {code}
> INFO - Parameters read from configuration file: joshua.config
> INFO - tm = 'TYPE -maxspan MAXSPAN -owner OWNER -path 
> /usr/local/joshua_resources/russian_experiments/exp3/test/1/model/grammar.gz.packed'
> INFO - tm = 'thrax -maxspan -1 -owner glue -path 
> /usr/local/joshua_resources/russian_experiments/exp3/test/1/model/grammar.glue'
> INFO - defaultnonterminal = 'X'
> INFO - goalsymbol = 'GOAL'
> INFO - markoovs = 'false'
> INFO - search = 'cky'
> INFO - pop-limit: 5000
> INFO - poplimit = '5000'
> INFO - topn = '300'
> INFO - useuniquenbest = 'true'
> INFO - outputformat = '%i ||| %s ||| %f ||| %c'
> INFO - includealignindex = 'false'
> INFO - featurefunction = 'OOVPenalty'
> INFO - featurefunction = 'WordPenalty'
> INFO - c = 'joshua.config'
> INFO - threads = '1'
> INFO - topn = '0'
> INFO - outputformat = '%s'
> INFO - Read 3 weights (0 of them dense)
> Exception in thread "main" java.lang.NumberFormatException: For input string: 
> "MAXSPAN"
>   at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>   at java.lang.Integer.parseInt(Integer.java:580)
>   at java.lang.Integer.parseInt(Integer.java:615)
>   at 
> org.apache.joshua.decoder.Decoder.initializeTranslationGrammars(Decoder.java:451)
>   at org.apache.joshua.decoder.Decoder.initialize(Decoder.java:389)
>   at org.apache.joshua.decoder.Decoder.(Decoder.java:128)
>   at org.apache.joshua.decoder.JoshuaDecoder.main(JoshuaDecoder.java:69)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (JOSHUA-320) --joshua-mem pipeline parameter is not populated to mert processes

2016-10-27 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-320:
---

 Summary: --joshua-mem pipeline parameter is not populated to mert 
processes
 Key: JOSHUA-320
 URL: https://issues.apache.org/jira/browse/JOSHUA-320
 Project: Joshua
  Issue Type: Bug
  Components: mert, pipeline
Affects Versions: 6.0.5
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
 Fix For: 6.2


As we've discussed on the Joshua mailing list at 
http://www.mail-archive.com/dev%40joshua.incubator.apache.org/msg01765.html
it is not realistic to reserve only 4g for several of the tasks which are 
executed as part of a typical pipeline run.
In particular, MERT runs with 4g, which is not enough. We should increase this 
to something like 8g or more.
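
A rough sketch of the intended behaviour: use the value given via --joshua-mem when there is one, otherwise fall back to a larger default. The function name and the 8g default are assumptions for illustration, not existing pipeline code; -Xmx is the standard JVM maximum heap option.

```python
import re


def mert_heap_flag(joshua_mem=None, default="8g"):
    """Build a JVM heap flag for MERT from the pipeline's memory setting.

    Hypothetical helper: joshua_mem stands for the value passed to
    --joshua-mem (e.g. "10g"); the 8g fallback follows the suggestion above.
    """
    spec = (joshua_mem or default).strip()
    # Accept only simple specs such as "10g" or "4096m".
    if re.fullmatch(r"\d+[gGmM]", spec) is None:
        raise ValueError("unrecognised memory specification: %r" % spec)
    return "-Xmx" + spec
```

With this, a pipeline started with --joshua-mem 10g would hand -Xmx10g to MERT instead of silently dropping back to 4g.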





[jira] [Commented] (JOSHUA-319) test-decode decoder_command results in java.lang.NumberFormatException: For input string: "MAXSPAN"

2016-10-26 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15610743#comment-15610743
 ] 

Lewis John McGibbney commented on JOSHUA-319:
-

Some supplementary reading, folks:
http://www.mail-archive.com/dev%40joshua.incubator.apache.org/msg01769.html


> test-decode decoder_command results in java.lang.NumberFormatException: For 
> input string: "MAXSPAN"
> ---
>
> Key: JOSHUA-319
> URL: https://issues.apache.org/jira/browse/JOSHUA-319
> Project: Joshua
>  Issue Type: Bug
>  Components: decoders
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
> Fix For: 6.1
>
>
> When I run the following command
> {code}
> /usr/local/incubator-joshua/bin/pipeline.pl  --rundir . --type hiero --corpus 
> /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en --tune 
> /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en.tune 
> --test 
> /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en.test 
> --source en --target ru --readme "Experiment 3 Run 1 of ru --> en model 
> training" --aligner berkeley --hadoop-mem 10g --tmp 
> /usr/local/hadoop-2.5.2/hadoop_tmp_dir --first-step test --grammar 
> /usr/local/joshua_resources/russian_experiments/exp3/grammar.gz --joshua-mem 
> 10g
> {code}
> I end up with the following message.
> {code}
> INFO - Parameters read from configuration file: joshua.config
> INFO - tm = 'TYPE -maxspan MAXSPAN -owner OWNER -path 
> /usr/local/joshua_resources/russian_experiments/exp3/test/1/model/grammar.gz.packed'
> INFO - tm = 'thrax -maxspan -1 -owner glue -path 
> /usr/local/joshua_resources/russian_experiments/exp3/test/1/model/grammar.glue'
> INFO - defaultnonterminal = 'X'
> INFO - goalsymbol = 'GOAL'
> INFO - markoovs = 'false'
> INFO - search = 'cky'
> INFO - pop-limit: 5000
> INFO - poplimit = '5000'
> INFO - topn = '300'
> INFO - useuniquenbest = 'true'
> INFO - outputformat = '%i ||| %s ||| %f ||| %c'
> INFO - includealignindex = 'false'
> INFO - featurefunction = 'OOVPenalty'
> INFO - featurefunction = 'WordPenalty'
> INFO - c = 'joshua.config'
> INFO - threads = '1'
> INFO - topn = '0'
> INFO - outputformat = '%s'
> INFO - Read 3 weights (0 of them dense)
> Exception in thread "main" java.lang.NumberFormatException: For input string: 
> "MAXSPAN"
>   at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>   at java.lang.Integer.parseInt(Integer.java:580)
>   at java.lang.Integer.parseInt(Integer.java:615)
>   at 
> org.apache.joshua.decoder.Decoder.initializeTranslationGrammars(Decoder.java:451)
>   at org.apache.joshua.decoder.Decoder.initialize(Decoder.java:389)
>   at org.apache.joshua.decoder.Decoder.<init>(Decoder.java:128)
>   at org.apache.joshua.decoder.JoshuaDecoder.main(JoshuaDecoder.java:69)
> {code}





[jira] [Created] (JOSHUA-319) test-decode decoder_command results in java.lang.NumberFormatException: For input string: "MAXSPAN"

2016-10-26 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-319:
---

 Summary: test-decode decoder_command results in 
java.lang.NumberFormatException: For input string: "MAXSPAN"
 Key: JOSHUA-319
 URL: https://issues.apache.org/jira/browse/JOSHUA-319
 Project: Joshua
  Issue Type: Bug
  Components: decoders
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
 Fix For: 6.1


When I run the following command
{code}
/usr/local/incubator-joshua/bin/pipeline.pl  --rundir . --type hiero --corpus 
/usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en --tune 
/usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en.tune 
--test 
/usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en.test 
--source en --target ru --readme "Experiment 3 Run 1 of ru --> en model 
training" --aligner berkeley --hadoop-mem 10g --tmp 
/usr/local/hadoop-2.5.2/hadoop_tmp_dir --first-step test --grammar 
/usr/local/joshua_resources/russian_experiments/exp3/grammar.gz --joshua-mem 10g
{code}
I end up with the following message.
{code}
INFO - Parameters read from configuration file: joshua.config
INFO - tm = 'TYPE -maxspan MAXSPAN -owner OWNER -path 
/usr/local/joshua_resources/russian_experiments/exp3/test/1/model/grammar.gz.packed'
INFO - tm = 'thrax -maxspan -1 -owner glue -path 
/usr/local/joshua_resources/russian_experiments/exp3/test/1/model/grammar.glue'
INFO - defaultnonterminal = 'X'
INFO - goalsymbol = 'GOAL'
INFO - markoovs = 'false'
INFO - search = 'cky'
INFO - pop-limit: 5000
INFO - poplimit = '5000'
INFO - topn = '300'
INFO - useuniquenbest = 'true'
INFO - outputformat = '%i ||| %s ||| %f ||| %c'
INFO - includealignindex = 'false'
INFO - featurefunction = 'OOVPenalty'
INFO - featurefunction = 'WordPenalty'
INFO - c = 'joshua.config'
INFO - threads = '1'
INFO - topn = '0'
INFO - outputformat = '%s'
INFO - Read 3 weights (0 of them dense)
Exception in thread "main" java.lang.NumberFormatException: For input string: 
"MAXSPAN"
at 
java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:580)
at java.lang.Integer.parseInt(Integer.java:615)
at 
org.apache.joshua.decoder.Decoder.initializeTranslationGrammars(Decoder.java:451)
at org.apache.joshua.decoder.Decoder.initialize(Decoder.java:389)
at org.apache.joshua.decoder.Decoder.<init>(Decoder.java:128)
at org.apache.joshua.decoder.JoshuaDecoder.main(JoshuaDecoder.java:69)
{code}





[jira] [Commented] (JOSHUA-318) scripts/training/run_tuner.py should enable configurable memory usage when invoking joshua-decoder

2016-10-26 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15609503#comment-15609503
 ] 

Lewis John McGibbney commented on JOSHUA-318:
-

The following code is where the sh*t hits the fan:
{code}
def get_features(config_file):
    """Queries the decoder for all dense features that will be fired by the feature
    functions activated in the config file"""

    output = check_output("%s/bin/joshua-decoder -c %s -show-weights -v 0" % (JOSHUA, config_file), shell=True)
    features = []
    for index, item in enumerate(output.split('\n')):
        if item != "":
            features.append(tuple(item.split()))
    return features
{code}
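
One possible shape for a memory-aware version is sketched below. It assumes the joshua-decoder wrapper script accepts a "-m" memory option; please verify that against bin/joshua-decoder before relying on it. The helper names and the extra parameters are hypothetical and not taken from run_tuner.py. The sketch also decodes the subprocess output, since check_output returns bytes under Python 3.

```python
import subprocess


def build_show_weights_command(joshua_home, config_file, memory=None):
    """Assemble the decoder invocation as an argument list (no shell needed)."""
    cmd = ["%s/bin/joshua-decoder" % joshua_home]
    if memory:
        # ASSUMPTION: the wrapper script takes "-m <size>" to set the JVM heap.
        cmd += ["-m", memory]
    cmd += ["-c", config_file, "-show-weights", "-v", "0"]
    return cmd


def parse_features(output):
    """Turn decoder output into a list of tuples, one per non-empty line."""
    if isinstance(output, bytes):
        output = output.decode("utf-8")
    return [tuple(line.split()) for line in output.split("\n") if line != ""]


def get_features(joshua_home, config_file, memory=None):
    """Query the decoder for its features, forwarding the memory setting."""
    output = subprocess.check_output(
        build_show_weights_command(joshua_home, config_file, memory))
    return parse_features(output)
```

Passing the command as a list avoids shell=True, so paths containing spaces do not need extra quoting.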

> scripts/training/run_tuner.py should enable configurable memory usage when 
> invoking joshua-decoder
> ---
>
> Key: JOSHUA-318
> URL: https://issues.apache.org/jira/browse/JOSHUA-318
> Project: Joshua
>  Issue Type: Improvement
>  Components: tuner
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
> Fix For: 6.2
>
>
> When I run the run_tuner.py script I can easily run into the following
> {code}
> [mert-1] rebuilding...
>   dep=/usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en
>   dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
> [CHANGED]
>   dep=tune/model/grammar.gz.packed/slice_0.source [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config.final
>  [NOT FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/training/run_tuner.py 
> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en 
> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.ru 
> --tunedir /usr/local/joshua_resources/russian_experiments/exp3/tune --tuner 
> mert --decoder 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/decoder_command 
> --decoder-config 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
> --decoder-output-file 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/output.nbest 
> --decoder-log-file 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.log 
> --iterations 10 --metric 'BLEU 4 closest'
>   JOB FAILED (return code 1)
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>   at 
> org.apache.joshua.decoder.ff.tm.packed.PackedGrammar$PackedSlice.initializeFeatureStructures(PackedGrammar.java:385)
>   at 
> org.apache.joshua.decoder.ff.tm.packed.PackedGrammar$PackedSlice.<init>(PackedGrammar.java:368)
>   at 
> org.apache.joshua.decoder.ff.tm.packed.PackedGrammar.<init>(PackedGrammar.java:153)
>   at 
> org.apache.joshua.decoder.Decoder.initializeTranslationGrammars(Decoder.java:458)
>   at org.apache.joshua.decoder.Decoder.initialize(Decoder.java:389)
>   at org.apache.joshua.decoder.Decoder.<init>(Decoder.java:128)
>   at org.apache.joshua.decoder.JoshuaDecoder.main(JoshuaDecoder.java:69)
> Traceback (most recent call last):
>   File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 553, 
> in <module>
> main(sys.argv)
>   File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 536, 
> in main
> run_zmert(opts.tunedir, opts.source, opts.target, opts.decoder, 
> opts.decoder_config, opts.decoder_output_file, opts)
>   File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 417, 
> in run_zmert
> opts.metric, opts.iterations or 10)
>   File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 399, 
> in setup_configs
> for feature,weight in get_features(config):
>   File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 351, 
> in get_features
> output = check_output("%s/bin/joshua-decoder -c %s -show-weights -v 0" % 
> (JOSHUA, config_file), shell=True)
>   File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 626, in 
> check_output
> **kwargs).stdout
>   File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 708, in 
> run
> output=stdout, stderr=stderr)
> subprocess.CalledProcessError: Command 
> '/usr/local/incubator-joshua/bin/joshua-decoder -c 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
> -show-weights -v 0' returned non-zero exit status 1
> {code}
> This is because, by default, the joshua-decoder script runs with 4g of memory. 
> The run_tuner.py script should be flexible enough to reuse the memory 
> allocation provided when the pipeline was initially invoked. This value should 
> then be passed to the joshua-decoder script.





[jira] [Created] (JOSHUA-318) scripts/training/run_tuner.py should enable configurable memory usage when invoking joshua-decoder

2016-10-26 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-318:
---

 Summary: scripts/training/run_tuner.py should enable configurable 
memory usage when invoking joshua-decoder
 Key: JOSHUA-318
 URL: https://issues.apache.org/jira/browse/JOSHUA-318
 Project: Joshua
  Issue Type: Improvement
  Components: tuner
Affects Versions: 6.0.5
Reporter: Lewis John McGibbney
 Fix For: 6.2


When I run the run_tuner.py script I can easily run into the following
{code}
[mert-1] rebuilding...
  dep=/usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en
  dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
[CHANGED]
  dep=tune/model/grammar.gz.packed/slice_0.source [CHANGED]
  
dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config.final
 [NOT FOUND]
  cmd=/usr/local/incubator-joshua/scripts/training/run_tuner.py 
/usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en 
/usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.ru 
--tunedir /usr/local/joshua_resources/russian_experiments/exp3/tune --tuner 
mert --decoder 
/usr/local/joshua_resources/russian_experiments/exp3/tune/decoder_command 
--decoder-config 
/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
--decoder-output-file 
/usr/local/joshua_resources/russian_experiments/exp3/tune/output.nbest 
--decoder-log-file 
/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.log 
--iterations 10 --metric 'BLEU 4 closest'
  JOB FAILED (return code 1)
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at 
org.apache.joshua.decoder.ff.tm.packed.PackedGrammar$PackedSlice.initializeFeatureStructures(PackedGrammar.java:385)
at 
org.apache.joshua.decoder.ff.tm.packed.PackedGrammar$PackedSlice.<init>(PackedGrammar.java:368)
at 
org.apache.joshua.decoder.ff.tm.packed.PackedGrammar.<init>(PackedGrammar.java:153)
at 
org.apache.joshua.decoder.Decoder.initializeTranslationGrammars(Decoder.java:458)
at org.apache.joshua.decoder.Decoder.initialize(Decoder.java:389)
at org.apache.joshua.decoder.Decoder.<init>(Decoder.java:128)
at org.apache.joshua.decoder.JoshuaDecoder.main(JoshuaDecoder.java:69)
Traceback (most recent call last):
  File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 553, 
in <module>
main(sys.argv)
  File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 536, 
in main
run_zmert(opts.tunedir, opts.source, opts.target, opts.decoder, 
opts.decoder_config, opts.decoder_output_file, opts)
  File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 417, 
in run_zmert
opts.metric, opts.iterations or 10)
  File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 399, 
in setup_configs
for feature,weight in get_features(config):
  File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 351, 
in get_features
output = check_output("%s/bin/joshua-decoder -c %s -show-weights -v 0" % 
(JOSHUA, config_file), shell=True)
  File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 626, in 
check_output
**kwargs).stdout
  File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 708, in 
run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command 
'/usr/local/incubator-joshua/bin/joshua-decoder -c 
/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
-show-weights -v 0' returned non-zero exit status 1
{code}
This is because, by default, the joshua-decoder script runs with 4g of memory. 
The run_tuner.py script should be flexible enough to reuse the memory 
allocation provided when the pipeline was initially invoked. This value should 
then be passed to the joshua-decoder script.
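
Until the memory value is passed through, the tuner could at least report this failure more clearly than a bare CalledProcessError. A minimal sketch, assuming the decoder's stderr has been captured as text; the function name is hypothetical and not part of run_tuner.py:

```python
def explain_decoder_failure(stderr_text, memory="4g"):
    """Return a hint if the decoder died from heap exhaustion, else None.

    Hypothetical helper: stderr_text is whatever the decoder wrote to
    stderr, and memory is the heap size it was started with.
    """
    if "java.lang.OutOfMemoryError" in stderr_text:
        return ("joshua-decoder ran out of heap space with %s; "
                "re-run with a larger memory allocation" % memory)
    return None
```

The caller would print the returned hint, if any, before re-raising the original error.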





[jira] [Updated] (JOSHUA-317) SyntaxError: invalid syntax scripts/training/run_tuner.py", line 391

2016-10-26 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-317:

Component/s: (was: er)
 tuner

> SyntaxError: invalid syntax scripts/training/run_tuner.py", line 391
> 
>
> Key: JOSHUA-317
> URL: https://issues.apache.org/jira/browse/JOSHUA-317
> Project: Joshua
>  Issue Type: Bug
>  Components: tuner
>Affects Versions: 6.0.5
> Environment: Python 3.5
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
> Fix For: 6.1
>
>
> {code}
> [tune-bundle] rebuilding...
>   
> dep=/usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
> [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp3/grammar.packed/slice_0.source
>  [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/model/run-joshua.sh
>  [NOT FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/support/run_bundler.py --force 
> --symlink --absolute --verbose -T /usr/local/hadoop-2.5.2/hadoop_tmp_dir 
> /usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/model 
> --copy-config-options '-top-n 300 -output-format "%i ||| %s ||| %f ||| %c" 
> -mark-oovs false -search cky -weights "lm_0 1 tm_pt_0 1 tm_pt_1 1 tm_pt_2 1 
> tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " -feature-function 
> "StateMinimizingLanguageModel -lm_order 5 -lm_file 
> /usr/local/joshua_resources/russian_experiments/exp3/lm.kenlm"  -tm0/type 
> hiero -tm0/owner pt -tm0/maxspan 20 -tm1/owner glue' --pack-tm 
> /usr/local/joshua_resources/russian_experiments/exp3/grammar.packed --tm 
> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/grammar.glue
>   took 0 seconds (0s)
> [mert-1] rebuilding...
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en 
> [CHANGED]
>   dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
> [CHANGED]
>   dep=tune/model/grammar.packed/slice_0.source [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config.final
>  [NOT FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/training/run_tuner.py 
> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en 
> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.ru 
> --tunedir /usr/local/joshua_resources/russian_experiments/exp3/tune --tuner 
> mert --decoder 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/decoder_command 
> --decoder-config 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
> --decoder-output-file 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/output.nbest 
> --decoder-log-file 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.log 
> --iterations 10 --metric 'BLEU 4 closest'
>   JOB FAILED (return code 1)
>   File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 391
> 'ITERATIONS': `iterations`,
>   ^
> SyntaxError: invalid syntax
> {code}





[jira] [Created] (JOSHUA-317) SyntaxError: invalid syntax scripts/training/run_tuner.py", line 391

2016-10-26 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-317:
---

 Summary: SyntaxError: invalid syntax 
scripts/training/run_tuner.py", line 391
 Key: JOSHUA-317
 URL: https://issues.apache.org/jira/browse/JOSHUA-317
 Project: Joshua
  Issue Type: Bug
  Components: er
Affects Versions: 6.0.5
 Environment: Python 3.5
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
 Fix For: 6.1


{code}
[tune-bundle] rebuilding...
  dep=/usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
[CHANGED]
  
dep=/usr/local/joshua_resources/russian_experiments/exp3/grammar.packed/slice_0.source
 [CHANGED]
  
dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/model/run-joshua.sh
 [NOT FOUND]
  cmd=/usr/local/incubator-joshua/scripts/support/run_bundler.py --force 
--symlink --absolute --verbose -T /usr/local/hadoop-2.5.2/hadoop_tmp_dir 
/usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
/usr/local/joshua_resources/russian_experiments/exp3/tune/model 
--copy-config-options '-top-n 300 -output-format "%i ||| %s ||| %f ||| %c" 
-mark-oovs false -search cky -weights "lm_0 1 tm_pt_0 1 tm_pt_1 1 tm_pt_2 1 
tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " -feature-function 
"StateMinimizingLanguageModel -lm_order 5 -lm_file 
/usr/local/joshua_resources/russian_experiments/exp3/lm.kenlm"  -tm0/type hiero 
-tm0/owner pt -tm0/maxspan 20 -tm1/owner glue' --pack-tm 
/usr/local/joshua_resources/russian_experiments/exp3/grammar.packed --tm 
/usr/local/joshua_resources/russian_experiments/exp3/data/tune/grammar.glue
  took 0 seconds (0s)
[mert-1] rebuilding...
  dep=/usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en 
[CHANGED]
  dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
[CHANGED]
  dep=tune/model/grammar.packed/slice_0.source [CHANGED]
  
dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config.final
 [NOT FOUND]
  cmd=/usr/local/incubator-joshua/scripts/training/run_tuner.py 
/usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en 
/usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.ru 
--tunedir /usr/local/joshua_resources/russian_experiments/exp3/tune --tuner 
mert --decoder 
/usr/local/joshua_resources/russian_experiments/exp3/tune/decoder_command 
--decoder-config 
/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
--decoder-output-file 
/usr/local/joshua_resources/russian_experiments/exp3/tune/output.nbest 
--decoder-log-file 
/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.log 
--iterations 10 --metric 'BLEU 4 closest'
  JOB FAILED (return code 1)
  File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 391
'ITERATIONS': `iterations`,
  ^
SyntaxError: invalid syntax
{code}
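
For context, the backtick form is Python 2 shorthand for repr() and was removed in Python 3, which is why the script fails to parse under Python 3.5. A minimal sketch of an equivalent that works on both versions; the surrounding dictionary and function name are assumed for illustration and are not copied from run_tuner.py:

```python
def build_params(iterations):
    """Build the substitution values without the Python 2 backtick syntax."""
    return {
        # Python 2 only: 'ITERATIONS': `iterations`,
        'ITERATIONS': repr(iterations),
    }
```

For an integer iteration count, repr() and str() give the same text, so either would do here.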





[jira] [Closed] (JOSHUA-259) Integration tests are failing

2016-10-26 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney closed JOSHUA-259.
---
Resolution: Not A Problem

> Integration tests are failing
> -
>
> Key: JOSHUA-259
> URL: https://issues.apache.org/jira/browse/JOSHUA-259
> Project: Joshua
>  Issue Type: Bug
>Reporter: Kellen Sunderland
> Fix For: 6.1
>
>
> Several integration tests are currently failing with Joshua.  I have a quick 
> fix coming for one of the tests but just in case we need more discussion 
> around the failures I'll open a bug.
> The currently failing tests for me:
> test/decoder/too-long
> test/server/http
> test/server/tcp-text
> test/thrax/extraction
> and 
> test/decoder/moses-compat (but this is easy to fix, simple extra space in the 
> expected file)
> These are failing under OS X 10.11.  If working under other environments feel 
> free to post a 'works for me'.





[jira] [Reopened] (JOSHUA-259) Integration tests are failing

2016-10-26 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney reopened JOSHUA-259:
-

> Integration tests are failing
> -
>
> Key: JOSHUA-259
> URL: https://issues.apache.org/jira/browse/JOSHUA-259
> Project: Joshua
>  Issue Type: Bug
>Reporter: Kellen Sunderland
> Fix For: 6.1
>
>
> Several integration tests are currently failing with Joshua.  I have a quick 
> fix coming for one of the tests but just in case we need more discussion 
> around the failures I'll open a bug.
> The currently failing tests for me:
> test/decoder/too-long
> test/server/http
> test/server/tcp-text
> test/thrax/extraction
> and 
> test/decoder/moses-compat (but this is easy to fix, simple extra space in the 
> expected file)
> These are failing under OS X 10.11.  If working under other environments feel 
> free to post a 'works for me'.





[jira] [Updated] (JOSHUA-259) Integration tests are failing

2016-10-26 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-259:

Fix Version/s: (was: 6.2)
   6.1

> Integration tests are failing
> -
>
> Key: JOSHUA-259
> URL: https://issues.apache.org/jira/browse/JOSHUA-259
> Project: Joshua
>  Issue Type: Bug
>Reporter: Kellen Sunderland
> Fix For: 6.1
>
>
> Several integration tests are currently failing with Joshua.  I have a quick 
> fix coming for one of the tests but just in case we need more discussion 
> around the failures I'll open a bug.
> The currently failing tests for me:
> test/decoder/too-long
> test/server/http
> test/server/tcp-text
> test/thrax/extraction
> and 
> test/decoder/moses-compat (but this is easy to fix, simple extra space in the 
> expected file)
> These are failing under OS X 10.11.  If working under other environments feel 
> free to post a 'works for me'.





[jira] [Reopened] (JOSHUA-71) OS X installation depends on coreutils to run thrax test

2016-10-26 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-71?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney reopened JOSHUA-71:


> OS X installation depends on coreutils to run thrax test
> 
>
> Key: JOSHUA-71
> URL: https://issues.apache.org/jira/browse/JOSHUA-71
> Project: Joshua
>  Issue Type: Bug
>Reporter: Luke Orland
> Fix For: 6.1
>
>
> the {{gstat}} command from coreutils is not installed in Darwin by default. 
> One must resolve that dependency via Homebrew, Macports, etc.
> The {{test/thrax/test.sh}} test will fail on an OS X system that does not 
> have coreutils installed. We should either change the test so that it does 
> not require coreutils in Darwin or make it clear in the (developer) 
> installation/setup instructions that coreutils are required for this test, 
> check for coreutils when running the thrax test, and output a helpful message 
> instructing the developer to go install coreutils if {{gstat}} is not found.
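
The check described in the issue could look roughly like the sketch below. The issue is about a shell test; this is written in Python purely for illustration. It assumes that plain stat is the right command on non-Darwin systems, and the function name is hypothetical.

```python
import platform
import shutil


def stat_command(system=None, which=shutil.which):
    """Locate the stat binary the thrax test needs, or fail with a clear hint.

    Hypothetical helper: `system` and `which` are parameters only so the
    lookup can be exercised without depending on the local machine.
    """
    system = system or platform.system()
    # ASSUMPTION: GNU stat is installed as "gstat" on Darwin and "stat" elsewhere.
    name = "gstat" if system == "Darwin" else "stat"
    path = which(name)
    if path is None:
        raise RuntimeError(
            "%s not found; install coreutils (e.g. via Homebrew or MacPorts) "
            "before running test/thrax/test.sh" % name)
    return path
```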





[jira] [Reopened] (JOSHUA-95) Vocabulary locking

2016-10-26 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-95?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney reopened JOSHUA-95:


> Vocabulary locking
> --
>
> Key: JOSHUA-95
> URL: https://issues.apache.org/jira/browse/JOSHUA-95
> Project: Joshua
>  Issue Type: Bug
>Reporter: Matt Post
>Assignee: Juri Ganitkevitch
> Fix For: 6.1
>
>
> Vocabulary::id() is still synchronized and a potential point of contention. 
> It would be nice to resolve this.





[jira] [Reopened] (JOSHUA-100) Add Shen et al. (2008) dependency LM

2016-10-26 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney reopened JOSHUA-100:
-

> Add Shen et al. (2008) dependency LM
> 
>
> Key: JOSHUA-100
> URL: https://issues.apache.org/jira/browse/JOSHUA-100
> Project: Joshua
>  Issue Type: New Feature
>Reporter: Matt Post
>Assignee: Matt Post
> Fix For: 6.1
>
>






[jira] [Closed] (JOSHUA-107) Verbosity levels

2016-10-26 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney closed JOSHUA-107.
---

> Verbosity levels
> 
>
> Key: JOSHUA-107
> URL: https://issues.apache.org/jira/browse/JOSHUA-107
> Project: Joshua
>  Issue Type: Bug
>Reporter: Matt Post
>Assignee: Matt Post
> Fix For: 6.1
>
>
> Joshua should support verbosity levels with a command-line switch, so it's 
> easy to shut it up with something like {{-v 0}} or {{-q}}.





[jira] [Closed] (JOSHUA-100) Add Shen et al. (2008) dependency LM

2016-10-26 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney closed JOSHUA-100.
---
Resolution: Fixed

> Add Shen et al. (2008) dependency LM
> 
>
> Key: JOSHUA-100
> URL: https://issues.apache.org/jira/browse/JOSHUA-100
> Project: Joshua
>  Issue Type: New Feature
>Reporter: Matt Post
>Assignee: Matt Post
> Fix For: 6.1
>
>






[jira] [Updated] (JOSHUA-100) Add Shen et al. (2008) dependency LM

2016-10-26 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-100:

Fix Version/s: (was: 6.2)
   6.1

> Add Shen et al. (2008) dependency LM
> 
>
> Key: JOSHUA-100
> URL: https://issues.apache.org/jira/browse/JOSHUA-100
> Project: Joshua
>  Issue Type: New Feature
>Reporter: Matt Post
>Assignee: Matt Post
> Fix For: 6.1
>
>






[jira] [Updated] (JOSHUA-95) Vocabulary locking

2016-10-26 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-95?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-95:
---
Fix Version/s: (was: 6.2)
   6.1

> Vocabulary locking
> --
>
> Key: JOSHUA-95
> URL: https://issues.apache.org/jira/browse/JOSHUA-95
> Project: Joshua
>  Issue Type: Bug
>Reporter: Matt Post
>Assignee: Juri Ganitkevitch
> Fix For: 6.1
>
>
> Vocabulary::id() is still synchronized and a potential point of contention. 
> It would be nice to resolve this.





[jira] [Reopened] (JOSHUA-22) Parallelize MBR computation

2016-10-26 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-22?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney reopened JOSHUA-22:


> Parallelize MBR computation
> ---
>
> Key: JOSHUA-22
> URL: https://issues.apache.org/jira/browse/JOSHUA-22
> Project: Joshua
>  Issue Type: Bug
>Reporter: Joshua Decoder
> Fix For: 6.1
>
>
> MBR should be multithreaded.  This would be easy to add following the model 
> used in the InputManager.





[jira] [Updated] (JOSHUA-22) Parallelize MBR computation

2016-10-26 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-22?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-22:
---
Fix Version/s: (was: 6.2)
   6.1

> Parallelize MBR computation
> ---
>
> Key: JOSHUA-22
> URL: https://issues.apache.org/jira/browse/JOSHUA-22
> Project: Joshua
>  Issue Type: Bug
>Reporter: Joshua Decoder
> Fix For: 6.1
>
>
> MBR should be multithreaded.  This would be easy to add following the model 
> used in the InputManager.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (JOSHUA-95) Vocabulary locking

2016-10-26 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-95?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney closed JOSHUA-95.
--
Resolution: Fixed

> Vocabulary locking
> --
>
> Key: JOSHUA-95
> URL: https://issues.apache.org/jira/browse/JOSHUA-95
> Project: Joshua
>  Issue Type: Bug
>Reporter: Matt Post
>Assignee: Juri Ganitkevitch
> Fix For: 6.1
>
>
> Vocabulary::id() is still synchronized and a potential point of contention. 
> It would be nice to resolve this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (JOSHUA-316) run_bundler.py returning JOB FAILED (return code 1) TypeError: memoryview: a bytes-like object is required, not 'str'

2016-10-25 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-316:

Fix Version/s: (was: 6.2)
   6.1

> run_bundler.py returning JOB FAILED (return code 1) TypeError: memoryview: a 
> bytes-like object is required, not 'str'
> -
>
> Key: JOSHUA-316
> URL: https://issues.apache.org/jira/browse/JOSHUA-316
> Project: Joshua
>  Issue Type: Bug
>  Components: bundler
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Priority: Critical
> Fix For: 6.1
>
>
> {code}
> [glue-tune] rebuilding...
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp2/grammar.packed/slice_0.source
>  [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue
>  [NOT FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/support/create_glue_grammar.sh 
> /usr/local/joshua_resources/russian_experiments/exp2/grammar.packed > 
> /usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue
>   took 1 seconds (1s)
> [tune-bundle] rebuilding...
>   
> dep=/usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
> [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp2/grammar.packed/slice_0.source
>  [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp2/tune/model/run-joshua.sh
>  [NOT FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/support/run_bundler.py --force 
> --symlink --absolute --verbose -T /usr/local/hadoop-2.5.2/hadoop_tmp_dir 
> /usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
> /usr/local/joshua_resources/russian_experiments/exp2/tune/model 
> --copy-config-options '-top-n 300 -output-format "%i ||| %s ||| %f ||| %c" 
> -mark-oovs false -search cky -weights "lm_0 1 tm_pt_0 1 tm_pt_1 1 tm_pt_2 1 
> tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " -feature-function 
> "StateMinimizingLanguageModel -lm_order 5 -lm_file 
> /usr/local/joshua_resources/russian_experiments/exp2/lm.kenlm"  -tm0/type 
> hiero -tm0/owner pt -tm0/maxspan 20 -tm1/owner glue' --pack-tm 
> /usr/local/joshua_resources/russian_experiments/exp2/grammar.packed --tm 
> /usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue
>   JOB FAILED (return code 1)
> * Running the copy-config.pl script with the command: 
> /usr/local/incubator-joshua/scripts/copy-config.pl -top-n 300 -output-format 
> "%i ||| %s ||| %f ||| %c" -mark-oovs false -search cky -weights "lm_0 1 
> tm_pt_0 1 tm_pt_1 1 tm_pt_2 1 tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " 
> -feature-function "StateMinimizingLanguageModel -lm_order 5 -lm_file 
> /usr/local/joshua_resources/russian_experiments/exp2/lm.kenlm"  -tm0/type 
> hiero -tm0/owner pt -tm0/maxspan 20 -tm1/owner glue
> Traceback (most recent call last):
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 
> 748, in main
> operations = collect_operations(opts)
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 
> 637, in collect_operations
> opts.copy_config_options
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 
> 202, in filter_through_copy_config_script
> result, err = p.communicate(config_text)
>   File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 1072, 
> in communicate
> stdout, stderr = self._communicate(input, endtime, timeout)
>   File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 1700, 
> in _communicate
> input_view = memoryview(self._input)
> TypeError: memoryview: a bytes-like object is required, not 'str'
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 
> 760, in <module>
> main(sys.argv)
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 
> 751, in main
> error_quit(e.message)
> AttributeError: 'TypeError' object has no attribute 'message'
> * WARNING: no key 'outputformat' found in config file (appending to end)
> * WARNING: no key 'search' found in config file (appending to end)
> * WARNING: no key 'topn' found in config file (appending to end)
> * WARNING: no key 'markoovs' found in config file (appending to end)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (JOSHUA-316) run_bundler.py returning JOB FAILED (return code 1) TypeError: memoryview: a bytes-like object is required, not 'str'

2016-10-21 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-316:

Summary: run_bundler.py returning JOB FAILED (return code 1) TypeError: 
memoryview: a bytes-like object is required, not 'str'  (was: run_bundler.py 
returning JOB FAILED (return code 1))

> run_bundler.py returning JOB FAILED (return code 1) TypeError: memoryview: a 
> bytes-like object is required, not 'str'
> -
>
> Key: JOSHUA-316
> URL: https://issues.apache.org/jira/browse/JOSHUA-316
> Project: Joshua
>  Issue Type: Bug
>  Components: bundler
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Priority: Critical
> Fix For: 6.2
>
>
> {code}
> [glue-tune] rebuilding...
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp2/grammar.packed/slice_0.source
>  [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue
>  [NOT FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/support/create_glue_grammar.sh 
> /usr/local/joshua_resources/russian_experiments/exp2/grammar.packed > 
> /usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue
>   took 1 seconds (1s)
> [tune-bundle] rebuilding...
>   
> dep=/usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
> [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp2/grammar.packed/slice_0.source
>  [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp2/tune/model/run-joshua.sh
>  [NOT FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/support/run_bundler.py --force 
> --symlink --absolute --verbose -T /usr/local/hadoop-2.5.2/hadoop_tmp_dir 
> /usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
> /usr/local/joshua_resources/russian_experiments/exp2/tune/model 
> --copy-config-options '-top-n 300 -output-format "%i ||| %s ||| %f ||| %c" 
> -mark-oovs false -search cky -weights "lm_0 1 tm_pt_0 1 tm_pt_1 1 tm_pt_2 1 
> tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " -feature-function 
> "StateMinimizingLanguageModel -lm_order 5 -lm_file 
> /usr/local/joshua_resources/russian_experiments/exp2/lm.kenlm"  -tm0/type 
> hiero -tm0/owner pt -tm0/maxspan 20 -tm1/owner glue' --pack-tm 
> /usr/local/joshua_resources/russian_experiments/exp2/grammar.packed --tm 
> /usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue
>   JOB FAILED (return code 1)
> * Running the copy-config.pl script with the command: 
> /usr/local/incubator-joshua/scripts/copy-config.pl -top-n 300 -output-format 
> "%i ||| %s ||| %f ||| %c" -mark-oovs false -search cky -weights "lm_0 1 
> tm_pt_0 1 tm_pt_1 1 tm_pt_2 1 tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " 
> -feature-function "StateMinimizingLanguageModel -lm_order 5 -lm_file 
> /usr/local/joshua_resources/russian_experiments/exp2/lm.kenlm"  -tm0/type 
> hiero -tm0/owner pt -tm0/maxspan 20 -tm1/owner glue
> Traceback (most recent call last):
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 
> 748, in main
> operations = collect_operations(opts)
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 
> 637, in collect_operations
> opts.copy_config_options
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 
> 202, in filter_through_copy_config_script
> result, err = p.communicate(config_text)
>   File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 1072, 
> in communicate
> stdout, stderr = self._communicate(input, endtime, timeout)
>   File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 1700, 
> in _communicate
> input_view = memoryview(self._input)
> TypeError: memoryview: a bytes-like object is required, not 'str'
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 
> 760, in <module>
> main(sys.argv)
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 
> 751, in main
> error_quit(e.message)
> AttributeError: 'TypeError' object has no attribute 'message'
> * WARNING: no key 'outputformat' found in config file (appending to end)
> * WARNING: no key 'search' found in config file (appending to end)
> * WARNING: no key 'topn' found in config file (appending to end)
> * WARNING: no key 'markoovs' found in config file (appending to end)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (JOSHUA-316) run_bundler.py returning JOB FAILED (return code 1)

2016-10-21 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-316:
---

 Summary: run_bundler.py returning JOB FAILED (return code 1)
 Key: JOSHUA-316
 URL: https://issues.apache.org/jira/browse/JOSHUA-316
 Project: Joshua
  Issue Type: Bug
  Components: bundler
Affects Versions: 6.0.5
Reporter: Lewis John McGibbney
Priority: Critical
 Fix For: 6.2


{code}
[glue-tune] rebuilding...
  
dep=/usr/local/joshua_resources/russian_experiments/exp2/grammar.packed/slice_0.source
 [CHANGED]
  
dep=/usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue 
[NOT FOUND]
  cmd=/usr/local/incubator-joshua/scripts/support/create_glue_grammar.sh 
/usr/local/joshua_resources/russian_experiments/exp2/grammar.packed > 
/usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue
  took 1 seconds (1s)
[tune-bundle] rebuilding...
  dep=/usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
[CHANGED]
  
dep=/usr/local/joshua_resources/russian_experiments/exp2/grammar.packed/slice_0.source
 [CHANGED]
  
dep=/usr/local/joshua_resources/russian_experiments/exp2/tune/model/run-joshua.sh
 [NOT FOUND]
  cmd=/usr/local/incubator-joshua/scripts/support/run_bundler.py --force 
--symlink --absolute --verbose -T /usr/local/hadoop-2.5.2/hadoop_tmp_dir 
/usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
/usr/local/joshua_resources/russian_experiments/exp2/tune/model 
--copy-config-options '-top-n 300 -output-format "%i ||| %s ||| %f ||| %c" 
-mark-oovs false -search cky -weights "lm_0 1 tm_pt_0 1 tm_pt_1 1 tm_pt_2 1 
tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " -feature-function 
"StateMinimizingLanguageModel -lm_order 5 -lm_file 
/usr/local/joshua_resources/russian_experiments/exp2/lm.kenlm"  -tm0/type hiero 
-tm0/owner pt -tm0/maxspan 20 -tm1/owner glue' --pack-tm 
/usr/local/joshua_resources/russian_experiments/exp2/grammar.packed --tm 
/usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue
  JOB FAILED (return code 1)
* Running the copy-config.pl script with the command: 
/usr/local/incubator-joshua/scripts/copy-config.pl -top-n 300 -output-format 
"%i ||| %s ||| %f ||| %c" -mark-oovs false -search cky -weights "lm_0 1 tm_pt_0 
1 tm_pt_1 1 tm_pt_2 1 tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " 
-feature-function "StateMinimizingLanguageModel -lm_order 5 -lm_file 
/usr/local/joshua_resources/russian_experiments/exp2/lm.kenlm"  -tm0/type hiero 
-tm0/owner pt -tm0/maxspan 20 -tm1/owner glue
Traceback (most recent call last):
  File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 748, 
in main
operations = collect_operations(opts)
  File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 637, 
in collect_operations
opts.copy_config_options
  File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 202, 
in filter_through_copy_config_script
result, err = p.communicate(config_text)
  File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 1072, in 
communicate
stdout, stderr = self._communicate(input, endtime, timeout)
  File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 1700, in 
_communicate
input_view = memoryview(self._input)
TypeError: memoryview: a bytes-like object is required, not 'str'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 760, 
in <module>
main(sys.argv)
  File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 751, 
in main
error_quit(e.message)
AttributeError: 'TypeError' object has no attribute 'message'
* WARNING: no key 'outputformat' found in config file (appending to end)
* WARNING: no key 'search' found in config file (appending to end)
* WARNING: no key 'topn' found in config file (appending to end)
* WARNING: no key 'markoovs' found in config file (appending to end)
{code}
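
The traceback above points at two likely Python 2 to 3 incompatibilities: `communicate()` receives a `str` while the pipe is in binary mode, and the error handler reads `e.message`, an attribute Python 3 exceptions do not have. A minimal sketch of one possible fix follows. `filter_through` and `describe` are hypothetical stand-ins for illustration, not the actual functions in run_bundler.py.

```python
import subprocess
import sys


def filter_through(cmd, text):
    """Pipe text through cmd and return its stdout as str.

    Hypothetical helper; not the real run_bundler.py function.
    """
    p = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    # In Python 3, pipes opened without text mode expect bytes, so encode first.
    out, _ = p.communicate(text.encode("utf-8"))
    return out.decode("utf-8")


def describe(exc):
    """Return a printable message for an exception on Python 2 and 3."""
    # Python 3 exceptions have no .message attribute; str(exc) is portable.
    return str(exc)


if __name__ == "__main__":
    # Round-trip a string through a child Python process that echoes stdin.
    echo = [sys.executable, "-c",
            "import sys; sys.stdout.write(sys.stdin.read())"]
    print(filter_through(echo, "top-n = 300\n"))
    print(describe(TypeError("a bytes-like object is required, not 'str'")))
```

Encoding at the call site keeps the change local; opening the pipe in text mode would be an alternative, at the cost of depending on the locale's default encoding.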



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (JOSHUA-312) Even though alignment is cached, it is always re-done in pipeline re-execution

2016-10-18 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586111#comment-15586111
 ] 

Lewis John McGibbney commented on JOSHUA-312:
-

boom goes the dynamite :)
Thanks [~post]

> Even though alignment is cached, it is always re-done in pipeline re-execution
> --
>
> Key: JOSHUA-312
> URL: https://issues.apache.org/jira/browse/JOSHUA-312
> Project: Joshua
>  Issue Type: Improvement
>  Components: alignment
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Critical
> Fix For: 6.1
>
>
> Suppose a pipeline fails after alignment. The alignment result is never cached, 
> and it becomes necessary to undertake alignment... again!
> We should investigate the process for caching alignments as it would really 
> speed up rerunning end-to-end pipelines for large input datasets.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (JOSHUA-312) Even though alignment is cached, it is always re-done in pipeline re-execution

2016-10-13 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney reassigned JOSHUA-312:
---

Assignee: Lewis John McGibbney

> Even though alignment is cached, it is always re-done in pipeline re-execution
> --
>
> Key: JOSHUA-312
> URL: https://issues.apache.org/jira/browse/JOSHUA-312
> Project: Joshua
>  Issue Type: Improvement
>  Components: alignment
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Critical
> Fix For: 6.2
>
>
> Suppose a pipeline fails after alignment. The alignment result is never cached, 
> and it becomes necessary to undertake alignment... again!
> We should investigate the process for caching alignments as it would really 
> speed up rerunning end-to-end pipelines for large input datasets.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (JOSHUA-312) Even though alignment is cached, it is always re-done in pipeline re-execution

2016-10-13 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15573593#comment-15573593
 ] 

Lewis John McGibbney commented on JOSHUA-312:
-

Okey doke... I managed to reproduce this today.
One of my pipelines just failed because I screwed up my paths; however, the 
failure came after alignment with the Berkeley aligner.
When I went to re-run the pipeline as follows, alignment was not pulled from 
the cache... it was completely re-run.
{code}
lmcgibbn@LMC-056430 /usr/local/joshua_resources/russian_experiments $ ls -al
total 8
drwxr-xr-x   7 lmcgibbn  wheel  238 Oct 13 16:48 .
drwxr-xr-x  22 lmcgibbn  wheel  748 Oct 13 12:09 ..
drwxr-xr-x  29 lmcgibbn  wheel  986 Oct 13 16:48 .cachepipe
-rw-r--r--   1 lmcgibbn  wheel   47 Oct 13 12:24 README
drwxr-xr-x   5 lmcgibbn  wheel  170 Oct 13 16:48 alignments
drwxr-xr-x  12 lmcgibbn  wheel  408 Oct 13 12:23 data
drwxr-xr-x   6 lmcgibbn  wheel  204 Oct 13 12:24 scripts
lmcgibbn@LMC-056430 /usr/local/joshua_resources/russian_experiments $ 
/usr/local/incubator-joshua/bin/pipeline.pl  --rundir . --type hiero --corpus 
/usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en --tune 
/usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en.tune 
--test 
/usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en.test 
--source en --target ru --readme "Experiment 1 Run 1 of ru --> en model 
training" --aligner berkeley
[train-copy-and-filter] cached, skipping...
[train-tokenize-en] cached, skipping...
[train-tokenize-ru] cached, skipping...
[train-trim] cached, skipping...
[train-lowercase-en] cached, skipping...
[train-lowercase-ru] cached, skipping...
[train-vocab-en] cached, skipping...
[train-vocab-ru] cached, skipping...
[tune-copy-and-filter] cached, skipping...
[tune-tokenize-en] cached, skipping...
[tune-tokenize-ru] cached, skipping...
[tune-lowercase-en] cached, skipping...
[tune-lowercase-ru] cached, skipping...
[tune-vocab-en] cached, skipping...
[tune-vocab-ru] cached, skipping...
[test-copy-and-filter] cached, skipping...
[test-tokenize-en] cached, skipping...
[test-tokenize-ru] cached, skipping...
[test-lowercase-en] cached, skipping...
[test-lowercase-ru] cached, skipping...
[test-vocab-en] cached, skipping...
[test-vocab-ru] cached, skipping...
[source-numlines] cached, skipping...
[source-numlines] retrieved cached result =>   817962
[berkeley-aligner-chunk-0] rebuilding...
  dep=alignments/0/word-align.conf
  
dep=/usr/local/joshua_resources/russian_experiments/data/train/splits/corpus.en.0
 [NOT FOUND]
  
dep=/usr/local/joshua_resources/russian_experiments/data/train/splits/corpus.ru.0
 [NOT FOUND]
  dep=alignments/0/training.align [NOT FOUND]
  cmd=java -d64 -Xmx10g -jar 
/usr/local/incubator-joshua/ext/berkeleyaligner/distribution/berkeleyaligner.jar
 ++alignments/0/word-align.conf
{code}

The aligner log looks as follows:

{code}
lmcgibbn@LMC-056430 /usr/local $ tail -f 
joshua_resources/russian_experiments/alignments/0/log
main() {
  Execution directory: alignments/0
  Preparing Training Data {
ERROR: No files found at source /dev/null
  } [23s, cum. 23s]
  817962 training sentences, 0 test sentences
  Training models: 2 stages {
Training stage 1: MODEL1 and MODEL1 jointly for 5 iterations {
  Initializing forward model
 [1m16s, cum. 1m16s]
  Initializing reverse model [1m36s, cum. 2m53s]
  Joint Train: 817962 sentences, jointly {
Iteration 1/5 {
  Sentence 1/817962
  Sentence 2/817962
  Sentence 3/817962
  Sentence 11/817962
  Sentence 40/817962
  Sentence 146/817962
...
{code}

It would therefore appear to me that YES, the pipeline is cached; however, on 
re-runs the cache is not consulted for the alignment step, so alignment is 
repeated.
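
One possible reading of the `[NOT FOUND]` markers in the log above: if a cache treats any missing or changed dependency as grounds for rebuilding, then absent intermediate files (such as the corpus splits) would force the alignment step to rerun. The sketch below illustrates that general idea only. It is an assumption about the behaviour, not CachePipe's actual implementation, and `file_signature`, `needs_rebuild` and the signature dictionary are hypothetical names.

```python
import hashlib
import os
import tempfile


def file_signature(path):
    """Return a SHA-1 of the file contents, or None if the file is missing."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as fh:
        return hashlib.sha1(fh.read()).hexdigest()


def needs_rebuild(deps, cached_signatures):
    """Rebuild when any dependency is missing or differs from its cached signature."""
    for dep in deps:
        sig = file_signature(dep)
        if sig is None or cached_signatures.get(dep) != sig:
            return True
    return False


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    conf = os.path.join(workdir, "word-align.conf")
    split = os.path.join(workdir, "corpus.en.0")
    for path in (conf, split):
        with open(path, "w") as fh:
            fh.write("placeholder\n")

    # Record signatures as a cache would after a successful run.
    cache = {p: file_signature(p) for p in (conf, split)}
    print(needs_rebuild([conf, split], cache))

    # Simulate a missing intermediate file before the re-run.
    os.remove(split)
    print(needs_rebuild([conf, split], cache))
```

Under this model, a step whose listed dependencies include files that are cleaned up or never written will never be reported as cached, whatever the state of its real output.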

> Even though alignment is cached, it is always re-done in pipeline re-execution
> --
>
> Key: JOSHUA-312
> URL: https://issues.apache.org/jira/browse/JOSHUA-312
> Project: Joshua
>  Issue Type: Improvement
>  Components: alignment
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Priority: Critical
> Fix For: 6.2
>
>
> Suppose a pipeline fails after alignment. The alignment result is never cached, 
> and it becomes necessary to undertake alignment... again!
> We should investigate the process for caching alignments as it would really 
> speed up rerunning end-to-end pipelines for large input datasets.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (JOSHUA-312) Even though alignment is cached, it is always re-done in pipeline re-execution

2016-09-22 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-312:

Summary: Even though alignment is cached, it is always re-done in pipeline 
re-execution  (was: Alignment is never cached)

> Even though alignment is cached, it is always re-done in pipeline re-execution
> --
>
> Key: JOSHUA-312
> URL: https://issues.apache.org/jira/browse/JOSHUA-312
> Project: Joshua
>  Issue Type: Improvement
>  Components: alignment
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Priority: Critical
> Fix For: 6.2
>
>
> Suppose a pipeline fails after alignment. The alignment result is never cached, 
> and it becomes necessary to undertake alignment... again!
> We should investigate the process for caching alignments as it would really 
> speed up rerunning end-to-end pipelines for large input datasets.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (JOSHUA-312) Alignment is never cached

2016-09-21 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-312:
---

 Summary: Alignment is never cached
 Key: JOSHUA-312
 URL: https://issues.apache.org/jira/browse/JOSHUA-312
 Project: Joshua
  Issue Type: Improvement
  Components: alignment
Affects Versions: 6.0.5
Reporter: Lewis John McGibbney
Priority: Critical
 Fix For: 6.2


Suppose a pipeline fails after alignment. The alignment result is never cached, 
and it becomes necessary to undertake alignment... again!
We should investigate the process for caching alignments as it would really 
speed up rerunning end-to-end pipelines for large input datasets.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (JOSHUA-299) Move regression tests to proper unit tests

2016-09-09 Thread lewis john mcgibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15477498#comment-15477498
 ] 

lewis john mcgibbney commented on JOSHUA-299:
-

mvn clean test is the way to go.




-- 
http://home.apache.org/~lewismc/
@hectorMcSpector
http://www.linkedin.com/in/lmcgibbney


> Move regression tests to proper unit tests
> --
>
> Key: JOSHUA-299
> URL: https://issues.apache.org/jira/browse/JOSHUA-299
> Project: Joshua
>  Issue Type: Bug
>Reporter: Matt Post
>Assignee: Lewis John McGibbney
> Fix For: 6.1
>
>
> Many of the regression tests (test*.sh under src/test/resources) have been 
> moved to proper unit tests, but this move should be completed, and the 
> regression tests should be deleted. This should be done for 6.1



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (JOSHUA-299) Move regression tests to proper unit tests

2016-09-07 Thread lewis john mcgibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15471850#comment-15471850
 ] 

lewis john mcgibbney commented on JOSHUA-299:
-

Nope, I did not, sorry. Please proceed!




-- 
http://home.apache.org/~lewismc/
@hectorMcSpector
http://www.linkedin.com/in/lmcgibbney


> Move regression tests to proper unit tests
> --
>
> Key: JOSHUA-299
> URL: https://issues.apache.org/jira/browse/JOSHUA-299
> Project: Joshua
>  Issue Type: Bug
>Reporter: Matt Post
>Assignee: Lewis John McGibbney
> Fix For: 6.1
>
>
> Many of the regression tests (test*.sh under src/test/resources) have been 
> moved to proper unit tests, but this move should be completed, and the 
> regression tests should be deleted. This should be done for 6.1



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (JOSHUA-304) word-align.conf alignment template file not compatible with berkeley aligner

2016-08-29 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15446876#comment-15446876
 ] 

Lewis John McGibbney commented on JOSHUA-304:
-

[~post] no problem at all. No need to apologise.
I just tested, after a clean download of the third-party deps, that this works 
a charm. Thanks for looking into it, I really appreciate it.
I am +1 for merging into master and resolving this as fixed [~post]

> word-align.conf alignment template file not compatible with berkeley aligner
> 
>
> Key: JOSHUA-304
> URL: https://issues.apache.org/jira/browse/JOSHUA-304
> Project: Joshua
>  Issue Type: Bug
>  Components: alignment, berkeley, templates
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Priority: Blocker
> Fix For: 6.1
>
>
> It took me quite some time to debug what was going on and why pipelines 
> were failing when using the Berkeley aligner.
> It turns out that the word-align.conf template provided at
> https://github.com/apache/incubator-joshua/blob/master/scripts/training/templates/alignment/word-align.conf
> is not compatible with the Berkeley aligner. 
> In particular, the following lines are incompatible:
> https://github.com/apache/incubator-joshua/blob/master/scripts/training/templates/alignment/word-align.conf#L12-L15
> Evidence of this is provided below:
> {code}
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar 
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'MODEL1 HMM'; valid choices: MODEL1|MODEL2|HMM|SYNTACTIC|NONE
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar 
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'MODEL1, HMM'; valid choices: MODEL1|MODEL2|HMM|SYNTACTIC|NONE
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar 
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'MODEL1 HMM'; valid choices: MODEL1|MODEL2|HMM|SYNTACTIC|NONE
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar 
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'JOINT JOINT'; valid choices: FORWARD|REVERSE|BOTH_INDEP|JOINT
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar 
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Exception in thread "main" java.lang.NumberFormatException: For input string: 
> "5 5"
>   at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>   at java.lang.Integer.parseInt(Integer.java:580)
>   at java.lang.Integer.parseInt(Integer.java:615)
>   at 
> edu.berkeley.nlp.fig.basic.OptInfo.interpretValue(OptionsParser.java:143)
>   at 
> edu.berkeley.nlp.fig.basic.OptInfo.interpretValue(OptionsParser.java:240)
>   at edu.berkeley.nlp.fig.basic.OptInfo.set(OptionsParser.java:294)
>   at 
> edu.berkeley.nlp.fig.basic.OptionsParser.readOptionsFile(OptionsParser.java:555)
>   at 
> edu.berkeley.nlp.fig.basic.OptionsParser.doParse(OptionsParser.java:604)
>   at edu.berkeley.nlp.fig.exec.Execution.init(Execution.java:293)
>   at edu.berkeley.nlp.wordAlignment.Main.main(Main.java:149)
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar 
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Cannot create directory: alignments/0
> {code}
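
The errors quoted above suggest that each of these options accepts exactly one value: one enum member or one integer, not a space- or comma-separated pair. A minimal sketch of a pre-flight check follows. The two choice sets are copied from the error messages; the names `FORWARD_MODELS`, `TRAINING_MODES`, `check_enum` and `check_int` are hypothetical and are not part of Joshua or the Berkeley aligner.

```python
# Allowed values as listed in the aligner's "valid choices" error messages.
FORWARD_MODELS = {"MODEL1", "MODEL2", "HMM", "SYNTACTIC", "NONE"}
TRAINING_MODES = {"FORWARD", "REVERSE", "BOTH_INDEP", "JOINT"}


def check_enum(value, choices):
    """Return True only if value is exactly one of the allowed choices."""
    return value.strip() in choices


def check_int(value):
    """Return True only if value parses as a single integer."""
    try:
        int(value.strip())
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    # Values taken from the failing runs quoted in the issue.
    for value in ("MODEL1 HMM", "MODEL1, HMM", "HMM"):
        print(repr(value), check_enum(value, FORWARD_MODELS))
    print(repr("JOINT JOINT"), check_enum("JOINT JOINT", TRAINING_MODES))
    print(repr("5 5"), check_int("5 5"))
```

Running such a check over a generated word-align.conf before launching the aligner would surface all offending lines at once, rather than one per failed run.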



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (JOSHUA-304) word-align.conf alignment template file not compatible with berkeley aligner

2016-08-29 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15446643#comment-15446643
 ] 

Lewis John McGibbney commented on JOSHUA-304:
-

Hi [~post],
What new steps did you actually add?
I've wiped everything that was generated by Joshua and rebuilt the JOSHUA-304 
branch. I'm getting the following:

{code}
$JOSHUA/bin/pipeline.pl --type hiero --rundir 
/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0 --readme 
"Baseline Hiero run 0 --lm-gen berkeleylm --lm berkeleylm --aligner berkeley 
JOSHUA-304" --source es --target en --lm-gen berkeleylm --lm berkeleylm 
--aligner berkeley --corpus $SPANISH/corpus/asr/callhome_train --corpus 
$SPANISH/corpus/asr/fisher_train --tune  $SPANISH/corpus/asr/fisher_dev --test  
$SPANISH/corpus/asr/callhome_devtest
...
snip
...
[test-vocab-es] rebuilding...
  
dep=/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/test/corpus.es
 [CHANGED]
  
dep=/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/test/vocab.es
 [NOT FOUND]
  cmd=cat 
/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/test/corpus.es
 | /usr/local/incubator-joshua/scripts/training/build-vocab.pl > 
/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/test/vocab.es
  took 0 seconds (0s)
[test-vocab-en] rebuilding...
  
dep=/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/test/corpus.en
 [CHANGED]
  
dep=/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/test/vocab.en
 [NOT FOUND]
  cmd=cat 
/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/test/corpus.en
 | /usr/local/incubator-joshua/scripts/training/build-vocab.pl > 
/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/test/vocab.en
  took 0 seconds (0s)
[source-numlines] rebuilding...
  
dep=/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/train/corpus.es
 [CHANGED]
  cmd=cat 
/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/train/corpus.es
 | wc -l
  took 0 seconds (0s)
[source-numlines] retrieved cached result =>   151810
[berkeley-aligner-chunk-0] rebuilding...
  dep=alignments/0/word-align.conf [CHANGED]
  
dep=/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/train/splits/corpus.es.0
 [NOT FOUND]
  
dep=/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/train/splits/corpus.en.0
 [NOT FOUND]
  dep=alignments/0/training.align [NOT FOUND]
  cmd=java -d64 -Xmx10g -jar 
/usr/local/incubator-joshua/ext/berkeleyaligner/distribution/berkeleyaligner.jar
 ++alignments/0/word-align.conf
  JOB FAILED (return code 1)
[aligner-combine] rebuilding...
  dep=alignments/0/training.en-es.align [NOT FOUND]
  dep=alignments/training.align [NOT FOUND]
  cmd=cat alignments/0/training.en-es.align > alignments/training.align
  JOB FAILED (return code 1)
cat: alignments/0/training.en-es.align: No such file or directory
{code}

> word-align.conf alignment template file not compatible with berkeley aligner
> 
>
> Key: JOSHUA-304
> URL: https://issues.apache.org/jira/browse/JOSHUA-304
> Project: Joshua
>  Issue Type: Bug
>  Components: alignment, berkeley, templates
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Priority: Blocker
> Fix For: 6.1
>
>
> It took me quite some time to debug what was going on and why pipelines 
> were failing when using the Berkeley aligner.
> It turns out that the word-align.conf template provided at
> https://github.com/apache/incubator-joshua/blob/master/scripts/training/templates/alignment/word-align.conf
> is not compatible with the Berkeley aligner. 
> In particular, the following lines are incompatible:
> https://github.com/apache/incubator-joshua/blob/master/scripts/training/templates/alignment/word-align.conf#L12-L15
> Evidence of this is provided below:
> {code}
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar 
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'MODEL1 HMM'; valid choices: MODEL1|MODEL2|HMM|SYNTACTIC|NONE
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar 
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'MODEL1, HMM'; valid choices: MODEL1|MODEL2|HMM|SYNTACTIC|NONE
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar 
> 

[jira] [Commented] (JOSHUA-297) List supported versions of Hadoop

2016-08-24 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436177#comment-15436177
 ] 

Lewis John McGibbney commented on JOSHUA-297:
-

The supported version is 2.5.2
https://github.com/joshua-decoder/thrax/blob/master/.classpath#L8


> List supported versions of Hadoop
> -
>
> Key: JOSHUA-297
> URL: https://issues.apache.org/jira/browse/JOSHUA-297
> Project: Joshua
>  Issue Type: Task
>Reporter: Bob Paulin
>Assignee: Matt Post
>Priority: Minor
> Fix For: 6.1
>
> Attachments: thrax-hadoop0.20.2.log, thrax-hadoop2.6.4.log
>
>
> When working through the training tutorial I noticed that no version of 
> Hadoop was listed, so I tried the latest, Hadoop 2.6.4. The Thrax job failed 
> on this version. It worked, however, with 0.20.2. I found that version on 
> http://joshua.incubator.apache.org/6.0/pipeline.html by hovering over a link 
> in the Hadoop section.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (JOSHUA-305) joshua-6.1-SNAPSHOT-source-release.zip takes ages to build

2016-08-24 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved JOSHUA-305.
-
Resolution: Not A Bug

This was due to a large language model being present within the joshua 
directory. This is not an issue.
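
To spot this kind of problem before running the build, a minimal sketch along the following lines can list unusually large files under the checkout. It is not part of Joshua; the function name `large_files` and the size threshold are assumptions made purely for illustration.

```python
import os


def large_files(root, min_bytes):
    """Return (size, path) pairs for files under root that are at least
    min_bytes in size, largest first."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # Skip files that vanish or cannot be read (e.g. broken symlinks).
                continue
            if size >= min_bytes:
                found.append((size, path))
    return sorted(found, reverse=True)
```

For example, `large_files("/usr/local/incubator-joshua", 100 * 1024 * 1024)` would report anything over roughly 100 MB, such as a stray language model, before mvn clean install tries to package it.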

> joshua-6.1-SNAPSHOT-source-release.zip takes ages to build
> --
>
> Key: JOSHUA-305
> URL: https://issues.apache.org/jira/browse/JOSHUA-305
> Project: Joshua
>  Issue Type: Bug
>  Components: build, core
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Priority: Blocker
> Fix For: 6.1
>
>
> When someone runs mvn clean install, the 
> joshua-6.1-SNAPSHOT-source-release.zip step takes absolutely ages to build. 
> We should investigate why this is the case.





[jira] [Created] (JOSHUA-305) joshua-6.1-SNAPSHOT-source-release.zip takes ages to build

2016-08-24 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-305:
---

 Summary: joshua-6.1-SNAPSHOT-source-release.zip takes ages to build
 Key: JOSHUA-305
 URL: https://issues.apache.org/jira/browse/JOSHUA-305
 Project: Joshua
  Issue Type: Bug
  Components: build, core
Affects Versions: 6.0.5
Reporter: Lewis John McGibbney
Priority: Blocker
 Fix For: 6.1


When someone runs mvn clean install, the joshua-6.1-SNAPSHOT-source-release.zip 
step takes absolutely ages to build. We should investigate why this is the case.





[jira] [Commented] (JOSHUA-304) word-align.conf alignment template file not compatible with berkeley aligner

2016-08-24 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435615#comment-15435615
 ] 

Lewis John McGibbney commented on JOSHUA-304:
-

ACK will do.

> word-align.conf alignment template file not compatible with berkeley aligner
> 
>
> Key: JOSHUA-304
> URL: https://issues.apache.org/jira/browse/JOSHUA-304
> Project: Joshua
>  Issue Type: Bug
>  Components: alignment, berkeley, templates
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Priority: Blocker
> Fix For: 6.1
>
>
> It took me quite some time to debug what was going on and why pipelines 
> were failing when using the Berkeley aligner.
> It turns out that the word-align.conf template provided at
> https://github.com/apache/incubator-joshua/blob/master/scripts/training/templates/alignment/word-align.conf
> is not compatible with the Berkeley aligner. 
> In particular, the following lines are incompatible:
> https://github.com/apache/incubator-joshua/blob/master/scripts/training/templates/alignment/word-align.conf#L12-L15
> Evidence of this is provided below:
> {code}
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar 
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'MODEL1 HMM'; valid choices: MODEL1|MODEL2|HMM|SYNTACTIC|NONE
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar 
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'MODEL1, HMM'; valid choices: MODEL1|MODEL2|HMM|SYNTACTIC|NONE
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar 
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'MODEL1 HMM'; valid choices: MODEL1|MODEL2|HMM|SYNTACTIC|NONE
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar 
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'JOINT JOINT'; valid choices: FORWARD|REVERSE|BOTH_INDEP|JOINT
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar 
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Exception in thread "main" java.lang.NumberFormatException: For input string: 
> "5 5"
>   at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>   at java.lang.Integer.parseInt(Integer.java:580)
>   at java.lang.Integer.parseInt(Integer.java:615)
>   at 
> edu.berkeley.nlp.fig.basic.OptInfo.interpretValue(OptionsParser.java:143)
>   at 
> edu.berkeley.nlp.fig.basic.OptInfo.interpretValue(OptionsParser.java:240)
>   at edu.berkeley.nlp.fig.basic.OptInfo.set(OptionsParser.java:294)
>   at 
> edu.berkeley.nlp.fig.basic.OptionsParser.readOptionsFile(OptionsParser.java:555)
>   at 
> edu.berkeley.nlp.fig.basic.OptionsParser.doParse(OptionsParser.java:604)
>   at edu.berkeley.nlp.fig.exec.Execution.init(Execution.java:293)
>   at edu.berkeley.nlp.wordAlignment.Main.main(Main.java:149)
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar 
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Cannot create directory: alignments/0
> {code}





[jira] [Commented] (JOSHUA-304) word-align.conf alignment template file not compatible with berkeley aligner

2016-08-24 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435133#comment-15435133
 ] 

Lewis John McGibbney commented on JOSHUA-304:
-

It may help if I post the options available in the current Berkeley aligner 
jar, which was built when I installed Joshua:
{code}
lmcgibbn@LMC-032857 /usr/local/incubator-joshua(master) $ java -jar 
./lib/berkeleyaligner.jar  -help
Usage:
  log.maxIndLevel <  int> : Maximum indent level. [10]
  log.msPerLine <  int> : Maximum number of milliseconds between consecutive lines of output. [1000]
  log.file <  str> : File to write log. []
  log.stdout < bool> : Whether to output to the console. [true]
  log.note <  str> : Dummy placeholder for a comment []
  log.forcePrint < bool> : Force printing from logs* [false]
  log.maxPrintErrors <  int> : Maximum number of errors (via error()) to print [1]
  EMWordAligner.nullProb <  dbl> : How to assign null-word probabilities (=1 means 1/n) [1.0E-6]
  EMWordAligner.usePosteriorDecoding < bool> : Use posterior decoding (recommended for best performance). [true]
  EMWordAligner.posteriorDecodingThreshold <  dbl> : Threshold in [0,1] for deciding whether an alignment should exist. [0.5]
  EMWordAligner.mergeConsiderNull < bool> : When merging expected sufficient statistics, take into account the NULL (fix). [false]
  EMWordAligner.handleUnknownWords < bool> : Don't crash with unknown words (better to train on test set). [false]
  EMWordAligner.priorFraction <  dbl> : Fraction of a count to add for links in dictionary prior (1 works well). [0.0]
  EMWordAligner.numThreads <  int> : Number of concurrent threads to use during E-step (set to number of processors). [1]
  EMWordAligner.safeConcurrency < bool> : Safe concurrency (gets rid of concurrency warnings at the expense of speed) [false]
  EMWordAligner.evaluateDuringTraining < bool> : Whether to evaluate the model after each training iteration (slower, more memory). [false]
  TreeWalkModel.usePushProbabilities < bool> : Separate parameters for moving and pushing. [true]
  TreeWalkModel.conditionOnTag < bool> : Whether to condition distortion on the tag types. [true]
  TreeWalkModel.cacheTreePaths < bool> : Whether to cache paths through trees (uses lots of memory; faster). [false]
  Evaluator.searchForThreshold < bool> : Evaluate using line search [false]
  Evaluator.thresholdIntervals <  int> : Sets the number of intervals for posterior threshold line search [20]
  Evaluator.saveAlignmentObjects < bool> : Save object files for proposed alignments (large files) [false]
  Main.trainSources < str*> : Directories or files containing training files. [example/train]
  Main.testSources < str*> : Directory or file containing testing files. [example/test]
  Main.sentences <  int> : Maximum number of the training sentences to use [2147483647]
  Main.offsetTrainingSentences <  int> : Skip this number of the first training sentences [0]
  Main.maxTestSentences <  int> : Maximum number of the test sentences to use [2147483647]
  Main.offsetTestSentences <  int> : Skip this number of the first test sentences [0]
  Main.foreignSuffix <  str> : Foreign language file suffix [f]
  Main.englishSuffix <  str> : English language file suffix [e]
  Main.itgTrainTestSplitPoint <  int> : When writing test (ITG) posteriors, where to divide train/test data? [0]
  Main.itgInputDir <  str> : What directory should we dump ITG test data to? []
  Main.reverseAlignments < bool> : Reverse test set alignments (i.e., foreign to english) [false]
  Main.oneIndexed < bool> : Are alignments one-indexed (default == no, 0-indexed) [false]
  Main.lowercaseWords < bool> : Convert all words to lowercase [false]
  Main.leaveTrainingOnDisk < bool> : Don't load and store the training set upfront (slower, but less memory) [false]
  Main.saveRejects < bool> : Save rejected sentence pairs [false]
  Main.forwardModels : Which word alignment model to use in the forward direction. [MODEL1 HMM]
  Main.reverseModels : Which word alignment model to use in the backward direction. [MODEL1 HMM]
  Main.iters < int*> : Number of iterations to run the model. [5 5]
  Main.mode : Whether to train the two models jointly or independently. [JOINT JOINT]
  Main.trainingCacheMaxSize <  int> : Max sentence length for caching the HMM trellis (efficiency only). [100]
  Main.loadParamsDir <  str> : Directory to load parameters from. []
  Main.loadLexicalModelOnly  < bool> : When true, the 

[jira] [Commented] (JOSHUA-304) word-align.conf alignment template file not compatible with berkeley aligner

2016-08-23 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15434164#comment-15434164
 ] 

Lewis John McGibbney commented on JOSHUA-304:
-

To get past the exceptions shown above, I had to change the template so that 
it ended up looking like the following:
{code}
## word-align.conf
## --
## This is an example training script for the Berkeley
## word aligner.  In this configuration it uses two HMM
## alignment models trained jointly and then decoded 
## using the competitive thresholding heuristic.

##
# Training: Defines the training regimen 
##

forwardModels   HMM
reverseModels   HMM
mode            JOINT
iters           5

###
# Execution: Controls output and program flow 
###

execDir alignments/0
create
saveParams  false
numThreads  1
msPerLine   1
alignTraining

#
# Language/Data 
#

foreignSuffix   es.0
englishSuffix   en.0

# Choose the training sources, which can either be directories or files that list files/directories
trainSources /usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/data/train/splits/corpus
sentences MAX
testSources /dev/null
overwriteExecDir true

#
# 1-best output 
#

competitiveThresholding

{code}
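
The same constraints can be checked before the aligner is launched. The following is a minimal sketch only: the allowed values are copied from the "valid choices" error messages quoted in this issue, the function name `check_align_conf` is hypothetical, and the assumption that each of these four options takes exactly one value is inferred from those errors, not from the aligner's source.

```python
import re

# Allowed values, copied from the aligner's error messages in this issue.
MODEL_CHOICES = {"MODEL1", "MODEL2", "HMM", "SYNTACTIC", "NONE"}
MODE_CHOICES = {"FORWARD", "REVERSE", "BOTH_INDEP", "JOINT"}


def check_align_conf(text):
    """Return a list of problems found in word-align.conf-style text."""
    problems = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        if key in ("forwardModels", "reverseModels"):
            if value not in MODEL_CHOICES:
                problems.append(f"{key}: invalid value {value!r}")
        elif key == "mode":
            if value not in MODE_CHOICES:
                problems.append(f"{key}: invalid value {value!r}")
        elif key == "iters":
            # Assumption: a single non-negative integer, as in the working template.
            if not re.fullmatch(r"\d+", value):
                problems.append(f"{key}: not a single integer: {value!r}")
    return problems
```

Run against a template containing `forwardModels MODEL1 HMM`, `mode JOINT JOINT` and `iters 5 5`, this flags each of those lines; run against the template above, it returns an empty list.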

> word-align.conf alignment template file not compatible with berkeley aligner
> 
>
> Key: JOSHUA-304
> URL: https://issues.apache.org/jira/browse/JOSHUA-304
> Project: Joshua
>  Issue Type: Bug
>  Components: alignment, berkeley, templates
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Priority: Blocker
> Fix For: 6.1
>
>
> It took me quite some time to debug what was going on and why pipelines 
> were failing when using the Berkeley aligner.
> It turns out that the word-align.conf template provided at
> https://github.com/apache/incubator-joshua/blob/master/scripts/training/templates/alignment/word-align.conf
> is not compatible with the Berkeley aligner. 
> In particular, the following lines are incompatible:
> https://github.com/apache/incubator-joshua/blob/master/scripts/training/templates/alignment/word-align.conf#L12-L15
> Evidence of this is provided below:
> {code}
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar 
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'MODEL1 HMM'; valid choices: MODEL1|MODEL2|HMM|SYNTACTIC|NONE
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar 
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'MODEL1, HMM'; valid choices: MODEL1|MODEL2|HMM|SYNTACTIC|NONE
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar 
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'MODEL1 HMM'; valid choices: MODEL1|MODEL2|HMM|SYNTACTIC|NONE
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar 
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'JOINT JOINT'; valid choices: FORWARD|REVERSE|BOTH_INDEP|JOINT
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar 
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Exception in thread "main" java.lang.NumberFormatException: For input string: 
> "5 5"
>   at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>   at java.lang.Integer.parseInt(Integer.java:580)
>   at java.lang.Integer.parseInt(Integer.java:615)
>   at 
> edu.berkeley.nlp.fig.basic.OptInfo.interpretValue(OptionsParser.java:143)
>   at 
> edu.berkeley.nlp.fig.basic.OptInfo.interpretValue(OptionsParser.java:240)
>   at edu.berkeley.nlp.fig.basic.OptInfo.set(OptionsParser.java:294)
>   at 
> edu.berkeley.nlp.fig.basic.OptionsParser.readOptionsFile(OptionsParser.java:555)
>   at 
> edu.berkeley.nlp.fig.basic.OptionsParser.doParse(OptionsParser.java:604)
>   at 

[jira] [Commented] (JOSHUA-299) Move regression tests to proper unit tests

2016-08-22 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15432062#comment-15432062
 ] 

Lewis John McGibbney commented on JOSHUA-299:
-

I'll scope this issue tomorrow [~post] and see if I can get a PR together.

> Move regression tests to proper unit tests
> --
>
> Key: JOSHUA-299
> URL: https://issues.apache.org/jira/browse/JOSHUA-299
> Project: Joshua
>  Issue Type: Bug
>Reporter: Matt Post
>Assignee: Lewis John McGibbney
> Fix For: 6.1
>
>
> Many of the regression tests (test*.sh under src/test/resources) have been 
> moved to proper unit tests, but this move should be completed, and the 
> regression tests should be deleted. This should be done for 6.1





[jira] [Assigned] (JOSHUA-299) Move regression tests to proper unit tests

2016-08-22 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney reassigned JOSHUA-299:
---

Assignee: Lewis John McGibbney

> Move regression tests to proper unit tests
> --
>
> Key: JOSHUA-299
> URL: https://issues.apache.org/jira/browse/JOSHUA-299
> Project: Joshua
>  Issue Type: Bug
>Reporter: Matt Post
>Assignee: Lewis John McGibbney
> Fix For: 6.1
>
>
> Many of the regression tests (test*.sh under src/test/resources) have been 
> moved to proper unit tests, but this move should be completed, and the 
> regression tests should be deleted. This should be done for 6.1





[jira] [Commented] (JOSHUA-287) KenLM.java catches UnsatisfiedLinkError when attempting to load libken.so (libken.dylib on OSX)

2016-08-13 Thread lewis john mcgibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420151#comment-15420151
 ] 

lewis john mcgibbney commented on JOSHUA-287:
-

Brilliant, Kellen, thank you.




> KenLM.java catches UnsatisfiedLinkError when attempting to load libken.so 
> (libken.dylib on OSX)
> ---
>
> Key: JOSHUA-287
> URL: https://issues.apache.org/jira/browse/JOSHUA-287
> Project: Joshua
>  Issue Type: Bug
>  Components: core, kenlm
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Assignee: Kellen Sunderland
> Fix For: 6.1
>
>
> As explained in 
> http://www.mail-archive.com/dev%40joshua.incubator.apache.org/msg01189.html 
> we currently have an issue where, when the code is checked out from master, 
> the following RuntimeException is thrown.
> {code}
> ---
>  T E S T S
> ---
> Running TestSuite
> WARN - sentence 0 too long 401, truncating to length 200
> WARN - sentence 0 too long 401, truncating to length 200
> WARN - sentence 0 too long 401, truncating to length 200
> WARN - sentence 0 too long 401, truncating to length 200
> tm_pt_0=-2.000 tm_glue_0=3.000 lm_0=-206.718 lm_0_oov=2.000 
> OOVPenalty=-200.000 | -198.000
> ERROR - * FATAL: Can't find libken.so (libken.dylib on OS X) in $JOSHUA/lib
> ERROR - *This probably means that the KenLM library didn't compile.
> ERROR - *Make sure that BOOST_ROOT is set to the root of your boost
> ERROR - *installation (it's not /opt/local/, the default), change to
> ERROR - *$JOSHUA, and type 'ant kenlm'. If problems persist, see the
> ERROR - *website (joshua-decoder.org).
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> {code}
> We need to fix this such that we can run static source code analysis via 
> sonar and have our results available on analysis.apache.org.
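
As a quick pre-flight check for the error shown in the log above, a minimal sketch like the following looks for the native library in the location the error message names ($JOSHUA/lib). The function name `find_libken` is hypothetical and this is not how KenLM.java itself loads the library; it only tests whether one of the two file names from the error message is present.

```python
import os


def find_libken(joshua_home):
    """Return the path of libken.so or libken.dylib under joshua_home/lib,
    or None if neither file exists."""
    lib_dir = os.path.join(joshua_home, "lib")
    for name in ("libken.so", "libken.dylib"):
        candidate = os.path.join(lib_dir, name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

Calling `find_libken(os.environ["JOSHUA"])` before running the test suite would show up front whether the KenLM build step needs to be run first.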





[jira] [Commented] (JOSHUA-249) Joshua Logo

2016-08-13 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420149#comment-15420149
 ] 

Lewis John McGibbney commented on JOSHUA-249:
-

Cool, can you please resolve this issue?




> Joshua Logo
> ---
>
> Key: JOSHUA-249
> URL: https://issues.apache.org/jira/browse/JOSHUA-249
> Project: Joshua
>  Issue Type: Task
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Minor
> Fix For: 6.1
>
> Attachments: apache_joshua_logo.png, apache_joshua_logo.xcf
>
>
> As we discussed on the mailing lists, this issue should gather all proposed 
> Joshua logos so we can VOTE on one or more of them.





[jira] [Updated] (JOSHUA-287) KenLM.java catches UnsatisfiedLinkError when attempting to load libken.so (libken.dylib on OSX)

2016-07-27 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-287:

Issue Type: Bug  (was: Improvement)

> KenLM.java catches UnsatisfiedLinkError when attempting to load libken.so 
> (libken.dylib on OSX)
> ---
>
> Key: JOSHUA-287
> URL: https://issues.apache.org/jira/browse/JOSHUA-287
> Project: Joshua
>  Issue Type: Bug
>  Components: core, kenlm
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
> Fix For: 6.1
>
>
> As explained in 
> http://www.mail-archive.com/dev%40joshua.incubator.apache.org/msg01189.html 
> we currently have an issue where, when the code is checked out from master, 
> the following RuntimeException is thrown.
> {code}
> ---
>  T E S T S
> ---
> Running TestSuite
> WARN - sentence 0 too long 401, truncating to length 200
> WARN - sentence 0 too long 401, truncating to length 200
> WARN - sentence 0 too long 401, truncating to length 200
> WARN - sentence 0 too long 401, truncating to length 200
> tm_pt_0=-2.000 tm_glue_0=3.000 lm_0=-206.718 lm_0_oov=2.000 
> OOVPenalty=-200.000 | -198.000
> ERROR - * FATAL: Can't find libken.so (libken.dylib on OS X) in $JOSHUA/lib
> ERROR - *This probably means that the KenLM library didn't compile.
> ERROR - *Make sure that BOOST_ROOT is set to the root of your boost
> ERROR - *installation (it's not /opt/local/, the default), change to
> ERROR - *$JOSHUA, and type 'ant kenlm'. If problems persist, see the
> ERROR - *website (joshua-decoder.org).
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> {code}
> We need to fix this such that we can run static source code analysis via 
> sonar and have our results available on analysis.apache.org.





[jira] [Created] (JOSHUA-283) Implement fast_align as one of the available alignment options

2016-07-20 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-283:
---

 Summary: Implement fast_align as one of the available alignment 
options
 Key: JOSHUA-283
 URL: https://issues.apache.org/jira/browse/JOSHUA-283
 Project: Joshua
  Issue Type: Bug
  Components: alignment, pipeline
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
 Fix For: 6.1


For some time now, I've been having issues using GIZA++ for alignment whilst 
running a Joshua pipeline.
Whilst I was looking for an alternative, [~post] and [~kellen.sunderland] 
mentioned the Berkeley aligner and fast_align respectively.
Because 1) the Berkeley aligner has not been touched in ~9 years, and 
2) no artifact currently exists on Maven Central, I am taking the advice and 
attempting to use fast_align.
This issue will augment the alignment code in Joshua to permit use of 
fast_align, which is ALv2.0 licensed.

https://github.com/clab/fast_align 





[jira] [Closed] (JOSHUA-281) split2files.pl support script no longer exists hence pipeline fails

2016-07-15 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney closed JOSHUA-281.
---
Resolution: Invalid

This is not a bug at all; my input parameters for the pipeline.pl invocation 
were incorrect.

> split2files.pl support script no longer exists hence pipeline fails
> ---
>
> Key: JOSHUA-281
> URL: https://issues.apache.org/jira/browse/JOSHUA-281
> Project: Joshua
>  Issue Type: Bug
>  Components: pipeline
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Blocker
> Fix For: 6.1
>
>
> When I attempt to run a pipeline, I get the following:
> {code}
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua(master) $ ../bin/pipeline.pl  
> --rundir . --type hiero --corpus 
> /usr/local/jpl/xdata/joshua_experiments/russian_model/commoncrawl.ru-en 
> --tune 
> /usr/local/jpl/xdata/joshua_experiments/russian_model/commoncrawl.ru-en.tune 
> --test 
> /usr/local/jpl/xdata/joshua_experiments/russian_model/commoncrawl.ru-en.test 
> --source en --target ru --rundir experiment_1/1 --readme "Russian model 
> generation experiment 1 run 1" --mbr
> [train-copy-and-filter] rebuilding...
>   
> dep=/usr/local/jpl/xdata/joshua_experiments/russian_model/commoncrawl.ru-en.en
>  [CHANGED]
>   
> dep=/usr/local/jpl/xdata/joshua_experiments/russian_model/commoncrawl.ru-en.ru
>  [CHANGED]
>   dep=/usr/local/incubator-joshua/experiment_1/1/data/train/train.en [NOT 
> FOUND]
>   dep=/usr/local/incubator-joshua/experiment_1/1/data/train/train.ru [NOT 
> FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/training/paste 
> /usr/local/jpl/xdata/joshua_experiments/russian_model/commoncrawl.ru-en.en 
> /usr/local/jpl/xdata/joshua_experiments/russian_model/commoncrawl.ru-en.ru | 
> /usr/local/incubator-joshua/scripts/training/filter-empty-lines.pl | 
> /usr/local/incubator-joshua/scripts/training/split2files.pl 
> /usr/local/incubator-joshua/experiment_1/1/data/train/train.en 
> /usr/local/incubator-joshua/experiment_1/1/data/train/train.ru
>   JOB FAILED (return code 127)
> /bin/bash: /usr/local/incubator-joshua/scripts/training/split2files.pl: No 
> such file or directory
> {code}
> The following commit changed the name of the file
> {code}
> Repository: incubator-joshua
> Updated Branches:
>   refs/heads/master 09fb6a2d3 -> f02bd279e
> combined split2files implementations
> Project: http://git-wip-us.apache.org/repos/asf/incubator-joshua/repo
> Commit: 
> http://git-wip-us.apache.org/repos/asf/incubator-joshua/commit/f02bd279
> Tree: http://git-wip-us.apache.org/repos/asf/incubator-joshua/tree/f02bd279
> Diff: http://git-wip-us.apache.org/repos/asf/incubator-joshua/diff/f02bd279
> Branch: refs/heads/master
> Commit: f02bd279e892408c9eca2a2a241f21f59cb105e9
> Parents: 09fb6a2
> Author: Matt Post 
> Authored: Wed May 18 09:12:07 2016 -0400
> Committer: Matt Post 
> Committed: Wed May 18 09:12:07 2016 -0400
> --
>  scripts/support/split2files  | 44 +++
>  scripts/support/splittabs.pl | 42 -
>  scripts/training/pipeline.pl |  8 ++---
>  scripts/training/split2files.pl  | 38 ---
>  scripts/training/trim_parallel_corpus.pl |  2 +-
>  5 files changed, 49 insertions(+), 85 deletions(-)
> --
> {code}
> I'll submit a PR to do the simple string replace... which is hopefully all 
> that is wrong here.
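
A return code of 127 from the shell means a command in the pipeline could not be found. A minimal sketch for checking a `cmd=` line before running it is shown below; the function name `missing_executables` is hypothetical, and the split on `|` is deliberately naive (it assumes no pipe characters inside quoted arguments).

```python
import os
import shlex


def missing_executables(cmd):
    """Return the first token of each pipeline stage that is written as a
    path but does not exist as a file."""
    missing = []
    for stage in cmd.split("|"):
        tokens = shlex.split(stage)
        # Only check commands given as paths; bare names are resolved via PATH.
        if tokens and os.sep in tokens[0] and not os.path.isfile(tokens[0]):
            missing.append(tokens[0])
    return missing
```

Applied to the failing command above, this would list any script path in the pipeline, such as scripts/training/split2files.pl, that is not present on disk.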





[jira] [Commented] (JOSHUA-280) Existing Language packs not compatible with Joshua master

2016-07-01 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359696#comment-15359696
 ] 

Lewis John McGibbney commented on JOSHUA-280:
-

The existing Chinese language pack works just fine:
{code}
lmcgibbn@LMC-032857 
/usr/local/Cellar/joshua/HEAD/libexec/zh-en-hiero-pack-2016-01(NUTCH-2089) $ 
./run-joshua-server.sh
Parameters read from configuration file:
tm = 'thrax -path grammar.packed -maxspan 20 -owner pt'
tm = 'thrax -path grammar.glue -maxspan -1 -owner glue'
defaultnonterminal = 'X'
goalsymbol = 'GOAL'
featurefunction = 'LanguageModel -lm_order 5 -lm_type berkeleylm -lm_file 
lm.berkeleylm'
markoovs = 'false'
search = 'cky'
poplimit = '100'
topn = '0'
useuniquenbest = 'true'
outputformat = '%S'
includealignindex = 'false'
featurefunction = 'OOVPenalty'
featurefunction = 'WordPenalty'
Parameters overridden from the command line:
server-port: 5674
serverport = '5674'
c = 'joshua.config'
Read 10 weights (0 of them dense)
Reading vocabulary: grammar.packed/vocabulary
Read 300317 entries from the vocabulary
Reading packed config: grammar.packed/config
102030405060708090.100%
Reading encoder configuration: grammar.packed/encoding
Loaded 62685418 rules
Reading grammar from file grammar.glue...
MemoryBasedBatchGrammar: Read 4 rules with 4 distinct source sides from 
'grammar.glue'
Memory used 3447.1 MB
Grammar loading took: 39 seconds.
Stateful object with state index 0
Loading Berkeley LM from binary lm.berkeleylm
FEATURE: tm_pt (weight 0.000)
FEATURE: tm_glue (weight 0.000)
FEATURE: lm_0, order 5 (weight 0.194)
FEATURE: OOVPenalty (weight 0.015)
FEATURE: WordPenalty (weight -0.460)
Grammar sorting happening lazily on-demand.
Model loading took 42 seconds
Memory used 4355.5 MB
** TCP Server running and listening on port 5674.
{code}

> Existing Language packs not compatible with Joshua master
> -
>
> Key: JOSHUA-280
> URL: https://issues.apache.org/jira/browse/JOSHUA-280
> Project: Joshua
>  Issue Type: Bug
>  Components: language packs
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Priority: Critical
> Fix For: 6.1
>
>
> When I work with the existing Spanish --> English language pack at 
> http://cs.jhu.edu/~post/language-packs/language-pack-es-en-phrase-2015-03-06.tgz,
>  I get the following error
> {code}
> lmcgibbn@LMC-032857 
> /usr/local/Cellar/joshua/HEAD/libexec/language-pack-es-en-phrase-2015-03-06(NUTCH-2089)
>  $ ./run-joshua-server.sh
> INFO - Parameters read from configuration file: joshua.config
> INFO - tm = 'moses -owner pt -maxspan 0 -path phrase-table.packed 
> -max-source-len 5'
> INFO - defaultnonterminal = 'X'
> INFO - goalsymbol = 'GOAL'
> INFO - featurefunction = 'StateMinimizingLanguageModel -lm_type kenlm 
> -lm_order 5 -lm_file lm.kenlm'
> INFO - markoovs = 'false'
> INFO - search = 'stack'
> INFO - pop-limit: 100
> INFO - poplimit = '100'
> INFO - topn = '0'
> INFO - useuniquenbest = 'true'
> INFO - outputformat = '%s'
> INFO - includealignindex = 'false'
> INFO - featurefunction = 'OOVPenalty'
> INFO - featurefunction = 'WordPenalty'
> INFO - featurefunction = 'Distortion'
> INFO - featurefunction = 'PhrasePenalty'
> INFO - c = 'joshua.config'
> INFO - server-port: 5674
> INFO - serverport = '5674'
> INFO - Read 9 weights (0 of them dense)
> INFO - Reading vocabulary: phrase-table.packed/vocabulary
> INFO - Read 191983 entries from the vocabulary
> INFO - Reading packed config: phrase-table.packed/config
> 102030405060708090.100%
> Exception in thread "main" java.lang.RuntimeException: The grammar at 
> phrase-table.packed was packed with packer version 0, but the earliest 
> supported version is 3
>   at 
> org.apache.joshua.decoder.ff.tm.packed.PackedGrammar.readConfig(PackedGrammar.java:1061)
>   at 
> org.apache.joshua.decoder.ff.tm.packed.PackedGrammar.<init>(PackedGrammar.java:143)
>   at 
> org.apache.joshua.decoder.phrase.PhraseTable.<init>(PhraseTable.java:65)
>   at 
> org.apache.joshua.decoder.Decoder.initializeTranslationGrammars(Decoder.java:603)
>   at org.apache.joshua.decoder.Decoder.initialize(Decoder.java:514)
>   at org.apache.joshua.decoder.Decoder.<init>(Decoder.java:126)
>   at org.apache.joshua.decoder.JoshuaDecoder.main(JoshuaDecoder.java:69)
> {code}
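The exception above reads like a plain version gate on the packed grammar's config. A minimal sketch of such a gate, assuming the minimum supported version is 3 as the message states; the class and method names are hypothetical and are not taken from PackedGrammar.java:

```java
// Hypothetical sketch of a packer-version gate; not Joshua's actual code.
class PackerVersionCheck {
  // Earliest packer version accepted, per the error message in this ticket.
  static final int EARLIEST_SUPPORTED_VERSION = 3;

  /** Returns true if a grammar packed with the given version can be loaded. */
  static boolean isSupported(int packedVersion) {
    return packedVersion >= EARLIEST_SUPPORTED_VERSION;
  }

  /** Throws a RuntimeException for grammars packed with a version that is too old. */
  static void requireSupported(String grammarDir, int packedVersion) {
    if (!isSupported(packedVersion)) {
      throw new RuntimeException(String.format(
          "The grammar at %s was packed with packer version %d, "
              + "but the earliest supported version is %d",
          grammarDir, packedVersion, EARLIEST_SUPPORTED_VERSION));
    }
  }

  public static void main(String[] args) {
    requireSupported("grammar.packed", 3);       // passes silently
    requireSupported("phrase-table.packed", 0);  // throws RuntimeException
  }
}
```

If that is what is happening, the older language packs would need to be repacked with a current packer rather than patched in place.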





[jira] [Commented] (JOSHUA-280) Existing Spanish --> English Language pack not compatible with Joshua master

2016-07-01 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359690#comment-15359690
 ] 

Lewis John McGibbney commented on JOSHUA-280:
-

[~post] any idea what's up here? Thanks

> Existing Spanish --> English Language pack not compatible with Joshua master
> 
>
> Key: JOSHUA-280
> URL: https://issues.apache.org/jira/browse/JOSHUA-280
> Project: Joshua
>  Issue Type: Bug
>  Components: language packs
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Priority: Critical
> Fix For: 6.1
>
>
> When I work with the existing Spanish --> English language pack at 
> http://cs.jhu.edu/~post/language-packs/language-pack-es-en-phrase-2015-03-06.tgz,
>  I get the following error
> {code}
> lmcgibbn@LMC-032857 
> /usr/local/Cellar/joshua/HEAD/libexec/language-pack-es-en-phrase-2015-03-06(NUTCH-2089)
>  $ ./run-joshua-server.sh
> INFO - Parameters read from configuration file: joshua.config
> INFO - tm = 'moses -owner pt -maxspan 0 -path phrase-table.packed 
> -max-source-len 5'
> INFO - defaultnonterminal = 'X'
> INFO - goalsymbol = 'GOAL'
> INFO - featurefunction = 'StateMinimizingLanguageModel -lm_type kenlm 
> -lm_order 5 -lm_file lm.kenlm'
> INFO - markoovs = 'false'
> INFO - search = 'stack'
> INFO - pop-limit: 100
> INFO - poplimit = '100'
> INFO - topn = '0'
> INFO - useuniquenbest = 'true'
> INFO - outputformat = '%s'
> INFO - includealignindex = 'false'
> INFO - featurefunction = 'OOVPenalty'
> INFO - featurefunction = 'WordPenalty'
> INFO - featurefunction = 'Distortion'
> INFO - featurefunction = 'PhrasePenalty'
> INFO - c = 'joshua.config'
> INFO - server-port: 5674
> INFO - serverport = '5674'
> INFO - Read 9 weights (0 of them dense)
> INFO - Reading vocabulary: phrase-table.packed/vocabulary
> INFO - Read 191983 entries from the vocabulary
> INFO - Reading packed config: phrase-table.packed/config
> 102030405060708090.100%
> Exception in thread "main" java.lang.RuntimeException: The grammar at 
> phrase-table.packed was packed with packer version 0, but the earliest 
> supported version is 3
>   at 
> org.apache.joshua.decoder.ff.tm.packed.PackedGrammar.readConfig(PackedGrammar.java:1061)
>   at 
> org.apache.joshua.decoder.ff.tm.packed.PackedGrammar.<init>(PackedGrammar.java:143)
>   at 
> org.apache.joshua.decoder.phrase.PhraseTable.<init>(PhraseTable.java:65)
>   at 
> org.apache.joshua.decoder.Decoder.initializeTranslationGrammars(Decoder.java:603)
>   at org.apache.joshua.decoder.Decoder.initialize(Decoder.java:514)
>   at org.apache.joshua.decoder.Decoder.<init>(Decoder.java:126)
>   at org.apache.joshua.decoder.JoshuaDecoder.main(JoshuaDecoder.java:69)
> {code}





[jira] [Assigned] (JOSHUA-279) Cannot build Joshua master branch

2016-07-01 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney reassigned JOSHUA-279:
---

Assignee: Lewis John McGibbney

> Cannot build Joshua master branch
> -
>
> Key: JOSHUA-279
> URL: https://issues.apache.org/jira/browse/JOSHUA-279
> Project: Joshua
>  Issue Type: Bug
>  Components: build, documentation, tests
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Blocker
> Fix For: 6.1
>
>
> Hi Folks,
> We need to be cautious about whatever is committed to the master branch... the 
> build has been broken for quite some time, and there are constant Javadoc 
> issues which make the build unstable as well.
> For example, when I attempt to build the master branch, we have failing 
> tests:
> {code}
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua(master) $ mvn clean install
> ...
> ---
>  T E S T S
> ---
> Running TestSuite
> tm_pt_0=-2.000 tm_glue_0=3.000 lm_0=-206.718 lm_0_oov=2.000 
> OOVPenalty=-200.000 | -198.000
> ERROR - * FATAL: Can't find libken.so (libken.dylib on OS X) in $JOSHUA/lib
> ERROR - *This probably means that the KenLM library didn't compile.
> ERROR - *Make sure that BOOST_ROOT is set to the root of your boost
> ERROR - *installation (it's not /opt/local/, the default), change to
> ERROR - *$JOSHUA, and type 'ant kenlm'. If problems persist, see the
> ERROR - *website (joshua-decoder.org).
> WARN - sentence 0 too long 401, truncating to length 200
> WARN - sentence 0 too long 401, truncating to length 200
> WARN - sentence 0 too long 401, truncating to length 200
> WARN - sentence 0 too long 401, truncating to length 200
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> %
> %
> %
> %
> %
> %
> %
> %
> %
> Tests run: 126, Failures: 1, Errors: 0, Skipped: 6, Time elapsed: 1.818 sec 
> <<< FAILURE! - in TestSuite
> setUp(org.apache.joshua.decoder.ff.lm.class_lm.ClassBasedLanguageModelTest)  
> Time elapsed: 0.075 sec  <<< FAILURE!
> java.lang.ExceptionInInitializerError
>   at 
> org.apache.joshua.decoder.ff.lm.class_lm.ClassBasedLanguageModelTest.setUp(ClassBasedLanguageModelTest.java:52)
> Caused by: java.lang.RuntimeException: java.lang.UnsatisfiedLinkError: no ken 
> in java.library.path
>   at 
> org.apache.joshua.decoder.ff.lm.class_lm.ClassBasedLanguageModelTest.setUp(ClassBasedLanguageModelTest.java:52)
> Caused by: java.lang.UnsatisfiedLinkError: no ken in java.library.path
>   at 
> org.apache.joshua.decoder.ff.lm.class_lm.ClassBasedLanguageModelTest.setUp(ClassBasedLanguageModelTest.java:52)
> Results :
> Failed tests:
> org.apache.joshua.decoder.ff.lm.class_lm.ClassBasedLanguageModelTest.setUp(org.apache.joshua.decoder.ff.lm.class_lm.ClassBasedLanguageModelTest)
>   Run 1: ClassBasedLanguageModelTest.setUp:52 » ExceptionInInitializer
>   Run 2: PASS
> Tests run: 124, Failures: 1, Errors: 0, Skipped: 4
> [INFO] 
> 
> [INFO] BUILD FAILURE
> {code}
> As a workaround I tried to build the project without running the test suite; 
> however, Javadoc issues now prevent me from doing so!
> {code}
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua(master) $ mvn clean install 
> -DskipTests
> ...
> 1 error
> 14 warnings
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time: 28.144 s
> [INFO] Finished at: 2016-07-01T14:11:42-07:00
> [INFO] Final Memory: 37M/303M
> [INFO] 
> 
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-javadoc-plugin:2.8:jar (attach-javadocs) on 
> project joshua: MavenReportException: Error while creating archive:
> [ERROR] Exit code: 1 - 
> /usr/local/incubator-joshua/src/main/java/org/apache/joshua/decoder/ff/lm/LanguageModelFF.java:217:
>  warning: no @param for rule
> [ERROR] public int[] getRuleIds(final Rule rule) {
> [ERROR] ^
> [ERROR] 
> /usr/local/incubator-joshua/src/main/java/org/apache/joshua/decoder/ff/lm/LanguageModelFF.java:217:
>  warning: no 

[jira] [Commented] (JOSHUA-279) Cannot build Joshua master branch

2016-07-01 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359676#comment-15359676
 ] 

Lewis John McGibbney commented on JOSHUA-279:
-

commit 342312e309ec1bb9b1074688c1fbd3897783bc49
Author: Lewis John McGibbney 
Date:   Fri Jul 1 14:40:44 2016 -0700

JOSHUA-279 Cannot build Joshua master branch

The above commit fixes the Javadoc and I can now build. The test suite is still 
failing, so I am still building with the -DskipTests flag.
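The remaining test failure is the "no ken in java.library.path" UnsatisfiedLinkError quoted below. One possible way to keep the suite green on machines without KenLM would be to probe for the native library before running tests that need it. A minimal sketch using only System.loadLibrary; the NativeLibraryProbe class is hypothetical and not part of Joshua:

```java
// Hypothetical helper: checks whether a JNI library can be loaded
// from java.library.path without letting the link error escape.
class NativeLibraryProbe {

  /** Returns true if System.loadLibrary succeeds for the given library name. */
  static boolean isLoadable(String libraryName) {
    try {
      System.loadLibrary(libraryName);
      return true;
    } catch (UnsatisfiedLinkError e) {
      // Thrown when the library is not found on java.library.path.
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("java.library.path = " + System.getProperty("java.library.path"));
    // "ken" resolves to libken.so on Linux and libken.dylib on OS X.
    System.out.println("ken loadable: " + isLoadable("ken"));
  }
}
```

A test's setUp could call isLoadable("ken") and skip itself when it returns false, instead of failing the whole build.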

> Cannot build Joshua master branch
> -
>
> Key: JOSHUA-279
> URL: https://issues.apache.org/jira/browse/JOSHUA-279
> Project: Joshua
>  Issue Type: Bug
>  Components: build, documentation, tests
>Reporter: Lewis John McGibbney
>Priority: Blocker
> Fix For: 6.1
>
>
> Hi Folks,
> We need to be cautious about whatever is committed to the master branch... the 
> build has been broken for quite some time, and there are constant Javadoc 
> issues which make the build unstable as well.
> For example, when I attempt to build the master branch, we have failing 
> tests:
> {code}
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua(master) $ mvn clean install
> ...
> ---
>  T E S T S
> ---
> Running TestSuite
> tm_pt_0=-2.000 tm_glue_0=3.000 lm_0=-206.718 lm_0_oov=2.000 
> OOVPenalty=-200.000 | -198.000
> ERROR - * FATAL: Can't find libken.so (libken.dylib on OS X) in $JOSHUA/lib
> ERROR - *This probably means that the KenLM library didn't compile.
> ERROR - *Make sure that BOOST_ROOT is set to the root of your boost
> ERROR - *installation (it's not /opt/local/, the default), change to
> ERROR - *$JOSHUA, and type 'ant kenlm'. If problems persist, see the
> ERROR - *website (joshua-decoder.org).
> WARN - sentence 0 too long 401, truncating to length 200
> WARN - sentence 0 too long 401, truncating to length 200
> WARN - sentence 0 too long 401, truncating to length 200
> WARN - sentence 0 too long 401, truncating to length 200
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> %
> %
> %
> %
> %
> %
> %
> %
> %
> Tests run: 126, Failures: 1, Errors: 0, Skipped: 6, Time elapsed: 1.818 sec 
> <<< FAILURE! - in TestSuite
> setUp(org.apache.joshua.decoder.ff.lm.class_lm.ClassBasedLanguageModelTest)  
> Time elapsed: 0.075 sec  <<< FAILURE!
> java.lang.ExceptionInInitializerError
>   at 
> org.apache.joshua.decoder.ff.lm.class_lm.ClassBasedLanguageModelTest.setUp(ClassBasedLanguageModelTest.java:52)
> Caused by: java.lang.RuntimeException: java.lang.UnsatisfiedLinkError: no ken 
> in java.library.path
>   at 
> org.apache.joshua.decoder.ff.lm.class_lm.ClassBasedLanguageModelTest.setUp(ClassBasedLanguageModelTest.java:52)
> Caused by: java.lang.UnsatisfiedLinkError: no ken in java.library.path
>   at 
> org.apache.joshua.decoder.ff.lm.class_lm.ClassBasedLanguageModelTest.setUp(ClassBasedLanguageModelTest.java:52)
> Results :
> Failed tests:
> org.apache.joshua.decoder.ff.lm.class_lm.ClassBasedLanguageModelTest.setUp(org.apache.joshua.decoder.ff.lm.class_lm.ClassBasedLanguageModelTest)
>   Run 1: ClassBasedLanguageModelTest.setUp:52 » ExceptionInInitializer
>   Run 2: PASS
> Tests run: 124, Failures: 1, Errors: 0, Skipped: 4
> [INFO] 
> 
> [INFO] BUILD FAILURE
> {code}
> As a workaround I tried to build the project without running the test suite; 
> however, Javadoc issues now prevent me from doing so!
> {code}
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua(master) $ mvn clean install 
> -DskipTests
> ...
> 1 error
> 14 warnings
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time: 28.144 s
> [INFO] Finished at: 2016-07-01T14:11:42-07:00
> [INFO] Final Memory: 37M/303M
> [INFO] 
> 
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-javadoc-plugin:2.8:jar (attach-javadocs) on 
> project joshua: MavenReportException: Error while creating archive:
> [ERROR] Exit code: 1 - 
> 

[jira] [Created] (JOSHUA-279) Cannot build Joshua master branch

2016-07-01 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-279:
---

 Summary: Cannot build Joshua master branch
 Key: JOSHUA-279
 URL: https://issues.apache.org/jira/browse/JOSHUA-279
 Project: Joshua
  Issue Type: Bug
  Components: tests, build, documentation
Reporter: Lewis John McGibbney
Priority: Blocker
 Fix For: 6.1


Hi Folks,
We need to be cautious about whatever is committed to the master branch... the 
build has been broken for quite some time, and there are constant Javadoc issues 
which make the build unstable as well.
For example, when I attempt to build the master branch, we have failing tests:
{code}
lmcgibbn@LMC-032857 /usr/local/incubator-joshua(master) $ mvn clean install
...
---
 T E S T S
---
Running TestSuite
tm_pt_0=-2.000 tm_glue_0=3.000 lm_0=-206.718 lm_0_oov=2.000 OOVPenalty=-200.000 
| -198.000
ERROR - * FATAL: Can't find libken.so (libken.dylib on OS X) in $JOSHUA/lib
ERROR - *This probably means that the KenLM library didn't compile.
ERROR - *Make sure that BOOST_ROOT is set to the root of your boost
ERROR - *installation (it's not /opt/local/, the default), change to
ERROR - *$JOSHUA, and type 'ant kenlm'. If problems persist, see the
ERROR - *website (joshua-decoder.org).
WARN - sentence 0 too long 401, truncating to length 200
WARN - sentence 0 too long 401, truncating to length 200
WARN - sentence 0 too long 401, truncating to length 200
WARN - sentence 0 too long 401, truncating to length 200
WARN - no grammars supplied!  Supplying dummy glue grammar.
WARN - no grammars supplied!  Supplying dummy glue grammar.
WARN - no grammars supplied!  Supplying dummy glue grammar.
WARN - no grammars supplied!  Supplying dummy glue grammar.
WARN - no grammars supplied!  Supplying dummy glue grammar.
WARN - no grammars supplied!  Supplying dummy glue grammar.
WARN - no grammars supplied!  Supplying dummy glue grammar.
WARN - no grammars supplied!  Supplying dummy glue grammar.
%
%
%
%
%
%
%
%
%
Tests run: 126, Failures: 1, Errors: 0, Skipped: 6, Time elapsed: 1.818 sec <<< 
FAILURE! - in TestSuite
setUp(org.apache.joshua.decoder.ff.lm.class_lm.ClassBasedLanguageModelTest)  
Time elapsed: 0.075 sec  <<< FAILURE!
java.lang.ExceptionInInitializerError
at 
org.apache.joshua.decoder.ff.lm.class_lm.ClassBasedLanguageModelTest.setUp(ClassBasedLanguageModelTest.java:52)
Caused by: java.lang.RuntimeException: java.lang.UnsatisfiedLinkError: no ken 
in java.library.path
at 
org.apache.joshua.decoder.ff.lm.class_lm.ClassBasedLanguageModelTest.setUp(ClassBasedLanguageModelTest.java:52)
Caused by: java.lang.UnsatisfiedLinkError: no ken in java.library.path
at 
org.apache.joshua.decoder.ff.lm.class_lm.ClassBasedLanguageModelTest.setUp(ClassBasedLanguageModelTest.java:52)


Results :

Failed tests:
org.apache.joshua.decoder.ff.lm.class_lm.ClassBasedLanguageModelTest.setUp(org.apache.joshua.decoder.ff.lm.class_lm.ClassBasedLanguageModelTest)
  Run 1: ClassBasedLanguageModelTest.setUp:52 » ExceptionInInitializer
  Run 2: PASS


Tests run: 124, Failures: 1, Errors: 0, Skipped: 4

[INFO] 
[INFO] BUILD FAILURE
{code}

As a workaround I tried to build the project without running the test suite; 
however, Javadoc issues now prevent me from doing so!

{code}
lmcgibbn@LMC-032857 /usr/local/incubator-joshua(master) $ mvn clean install 
-DskipTests
...
1 error
14 warnings
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 28.144 s
[INFO] Finished at: 2016-07-01T14:11:42-07:00
[INFO] Final Memory: 37M/303M
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-javadoc-plugin:2.8:jar (attach-javadocs) on 
project joshua: MavenReportException: Error while creating archive:
[ERROR] Exit code: 1 - 
/usr/local/incubator-joshua/src/main/java/org/apache/joshua/decoder/ff/lm/LanguageModelFF.java:217:
 warning: no @param for rule
[ERROR] public int[] getRuleIds(final Rule rule) {
[ERROR] ^
[ERROR] 
/usr/local/incubator-joshua/src/main/java/org/apache/joshua/decoder/ff/lm/LanguageModelFF.java:217:
 warning: no @return
[ERROR] public int[] getRuleIds(final Rule rule) {
[ERROR] ^
[ERROR] 
/usr/local/incubator-joshua/src/main/java/org/apache/joshua/decoder/ff/lm/LanguageModelFF.java:231:
 warning: no @param for words
[ERROR] public int getOovs(final int[] words) {
[ERROR] ^
[ERROR] 
/usr/local/incubator-joshua/src/main/java/org/apache/joshua/decoder/ff/lm/LanguageModelFF.java:231:
 warning: no @return
[ERROR] public int 

[jira] [Resolved] (JOSHUA-269) Fix Javadoc in JOSHUA-252 branch to comply with JDK1.8 Spec

2016-06-20 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved JOSHUA-269.
-
Resolution: Fixed

> Fix Javadoc in JOSHUA-252 branch to comply with JDK1.8 Spec
> ---
>
> Key: JOSHUA-269
> URL: https://issues.apache.org/jira/browse/JOSHUA-269
> Project: Joshua
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Blocker
> Fix For: 6.1
>
>
> When we build the JOSHUA-252 codebase on Jenkins, we get the following:
> {code}
> [INFO] 
> 
> [ERROR] BUILD ERROR
> [INFO] 
> 
> [INFO] An error has occurred in JavaDocs report generation: 
> Exit code: 1 - 
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/oracle/OracleExtractionHG.java:629:
>  warning: no @param for tbl
>   public void get_ngrams(HashMap tbl, int order, 
> ArrayList wrds,
>   ^
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/oracle/OracleExtractionHG.java:629:
>  warning: no @param for order
>   public void get_ngrams(HashMap tbl, int order, 
> ArrayList wrds,
>   ^
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/oracle/OracleExtractionHG.java:629:
>  warning: no @param for wrds
>   public void get_ngrams(HashMap tbl, int order, 
> ArrayList wrds,
>   ^
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/oracle/OracleExtractionHG.java:629:
>  warning: no @param for ignore_null_equiv_symbol
>   public void get_ngrams(HashMap tbl, int order, 
> ArrayList wrds,
>   ^
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/oracle/OracleExtractionHG.java:45:
>  error: malformed HTML
>  * @author Zhifei Li,  (Johns Hopkins University)
>   ^
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/oracle/OracleExtractionHG.java:45:
>  error: bad use of '>'
>  * @author Zhifei Li,  (Johns Hopkins University)
> ^
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/oracle/OracleExtractionHG.java:91:
>  warning: no description for @param
>* @param lm_feat_id_
>  ^
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/oracle/SplitHg.java:33:
>  error: malformed HTML
>  * @author Zhifei Li,  (Johns Hopkins University)
>   ^
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/oracle/SplitHg.java:33:
>  error: bad use of '>'
>  * @author Zhifei Li,  (Johns Hopkins University)
> ^
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/ui/tree_visualizer/browser/Browser.java:77:
>  error: @param name not found
>* @param args the paths to the source, reference, and n-best files
> ^
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/ui/tree_visualizer/browser/Browser.java:79:
>  warning: no @param for argv
>   public static void main(String[] argv) throws IOException {
>  ^
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/ui/tree_visualizer/browser/Browser.java:79:
>  warning: no @throws for java.io.IOException
>   public static void main(String[] argv) throws IOException {
>  ^
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/ui/tree_visualizer/tree/Tree.java:165:
>  warning: no @return
>   public int size() {
>  ^
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/ui/tree_visualizer/tree/Tree.java:172:
>  warning: no @return
>   public Node root() {
>   ^
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/ui/tree_visualizer/tree/Tree.java:51:
>  error: malformed HTML
>  * @author Jonny Weese 
>^
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/ui/tree_visualizer/tree/Tree.java:51:
>  error: bad use of '>'
>  * @author Jonny Weese 
> ^
> 
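The doclint complaints above fall into a few classes: missing @param, @return or @throws tags, @param tags with no description or naming a parameter that does not exist, and raw angle brackets in @author lines. Below is a minimal sketch of a comment that JDK 1.8 doclint should accept. It assumes the "malformed HTML" errors come from an unescaped <email> after the author name (the archive has dropped whatever followed the name, so that is a guess). NgramCounter, its method and the author line are hypothetical and not from the Joshua codebase:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical example class showing doclint-clean Javadoc.
 *
 * @author Jane Doe &lt;jane.doe@example.org&gt;
 */
class NgramCounter {

  /**
   * Counts every n-gram of length 1 up to {@code order} in a sentence.
   *
   * @param words the tokenized sentence
   * @param order the maximum n-gram length, at least 1
   * @return a map from each space-joined n-gram to its count
   * @throws IllegalArgumentException if {@code order} is less than 1
   */
  static Map<String, Integer> countNgrams(List<String> words, int order) {
    if (order < 1) {
      throw new IllegalArgumentException("order must be >= 1");
    }
    Map<String, Integer> counts = new HashMap<>();
    for (int start = 0; start < words.size(); start++) {
      StringBuilder ngram = new StringBuilder();
      // Grow the n-gram one word at a time, counting each prefix.
      for (int len = 1; len <= order && start + len <= words.size(); len++) {
        if (len > 1) {
          ngram.append(' ');
        }
        ngram.append(words.get(start + len - 1));
        counts.merge(ngram.toString(), 1, Integer::sum);
      }
    }
    return counts;
  }
}
```

The points that matter for doclint are that every parameter has a described @param, the return value and thrown exception are documented, and the angle brackets are written as &lt; and &gt;.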

[jira] [Updated] (JOSHUA-275) Revamp the Configuration System

2016-06-20 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-275:

Fix Version/s: (was: 6,2)
   6.2

> Revamp the Configuration System
> ---
>
> Key: JOSHUA-275
> URL: https://issues.apache.org/jira/browse/JOSHUA-275
> Project: Joshua
>  Issue Type: Improvement
>Affects Versions: 6.1, 6.2, 7
>Reporter: Kellen Sunderland
> Fix For: 6.2
>
>
> I'd like to propose we centralize Joshua's configuration system around 
> typesafe/config (https://github.com/typesafehub/config). Its file format, 
> HOCON, is a superset of JSON that allows comments, so it's easy to read. 
> Because it is JSON-like, it supports hierarchies of configurations, lists of 
> configuration values, etc. quite easily. It has some nice features, like 
> parsing durations automatically. The main advantage, though, is that we would 
> have a standard config system that doesn't have to be manually parsed.
> Here's a quick example of how we can use it:
> {code:java}
> @Inject
> public PackedGrammar(@TypesafeConfig("PackedGrammar.grammar_dir") String grammar_dir,
>                      @TypesafeConfig("PackedGrammar.span_limit") int span_limit,
>                      String owner,
>                      String type) throws FileNotFoundException, IOException ...
> {code}
> and then a config similar to:
> # Joshua configuration file
> {code:javascript}
> config = {
>   default-non-terminal = X
>   goal-symbol = GOAL
>   ...
> 
>   PackedGrammar: {
>     type: thrax,
>     grammar_dir: /local/grammars/...
>     span_limit: 50
>   }
>   ...
> }
> {code}
> Version: TBD, but it's a breaking change so we may consider putting it in 
> Joshua 7.
> Totally open to other config / injection systems if others want to suggest 
> any of their favorites.
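To make the idea of hierarchical, typed lookups concrete, here is a minimal sketch that uses only the JDK. It is a stand-in for illustration and is NOT the typesafe/config API: DottedConfig and its methods are hypothetical, and the values mirror the PackedGrammar example in the description.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a hierarchical config with dotted-path lookups.
class DottedConfig {
  private final Map<String, Object> root;

  DottedConfig(Map<String, Object> root) {
    this.root = root;
  }

  // Walks nested maps one path segment at a time.
  private Object resolve(String path) {
    Object node = root;
    for (String key : path.split("\\.")) {
      if (!(node instanceof Map)) {
        throw new IllegalArgumentException("No such path: " + path);
      }
      node = ((Map<?, ?>) node).get(key);
      if (node == null) {
        throw new IllegalArgumentException("No such path: " + path);
      }
    }
    return node;
  }

  String getString(String path) {
    return String.valueOf(resolve(path));
  }

  int getInt(String path) {
    return ((Number) resolve(path)).intValue();
  }

  /** Builds a config shaped like the PackedGrammar block in the description. */
  static DottedConfig example() {
    Map<String, Object> grammar = new HashMap<>();
    grammar.put("type", "thrax");
    grammar.put("span_limit", 50);
    Map<String, Object> root = new HashMap<>();
    root.put("PackedGrammar", grammar);
    return new DottedConfig(root);
  }

  public static void main(String[] args) {
    DottedConfig config = example();
    System.out.println(config.getString("PackedGrammar.type"));
    System.out.println(config.getInt("PackedGrammar.span_limit"));
  }
}
```

The point of the proposal is that callers ask for "PackedGrammar.span_limit" as an int and never parse strings themselves; a real config library would add file parsing, defaults and validation on top of this.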





[jira] [Updated] (JOSHUA-265) Refactor key interfaces and core code for a future release.

2016-06-20 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-265:

Fix Version/s: 6.2

> Refactor key interfaces and core code for a future release. 
> 
>
> Key: JOSHUA-265
> URL: https://issues.apache.org/jira/browse/JOSHUA-265
> Project: Joshua
>  Issue Type: Improvement
>Reporter: Kellen Sunderland
>Priority: Minor
> Fix For: 6.2
>
>
> We've discussed making some modifications to the key interfaces.  This ticket 
> can focus on making large changes to the codebase for a future release.  This 
> work will likely take some time and some collaboration.  I'd suggest the code 
> for this live on a separate release branch.
> Some issues we can work on:
> *  I'd propose we conform to the SOLID principles for our major interfaces: 
> https://en.wikipedia.org/wiki/SOLID_(object-oriented_design)
> *  We can look at Sparse / Dense feature vectors and how to handle them 
> naturally in Joshua.
> *  Refactor objects that may now be used more broadly than was originally 
> intended (for example Vocabulary class).
> *  We should have a general discussion around what parts of the codebase are 
> responsible for what functions.  We should clearly define what logic should 
> be a part of the Grammar versus the Feature Functions for example, and make 
> sure logic doesn't leak from one of these objects to the others.
>  





[jira] [Updated] (JOSHUA-275) Revamp the Configuration System

2016-06-20 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-275:

Fix Version/s: 6,2

> Revamp the Configuration System
> ---
>
> Key: JOSHUA-275
> URL: https://issues.apache.org/jira/browse/JOSHUA-275
> Project: Joshua
>  Issue Type: Improvement
>Affects Versions: 6.1, 6.2, 7
>Reporter: Kellen Sunderland
> Fix For: 6,2
>
>
> I'd like to propose we centralize Joshua's configuration system around 
> typesafe/config (https://github.com/typesafehub/config). Its file format, 
> HOCON, is a superset of JSON that allows comments, so it's easy to read. 
> Because it is JSON-like, it supports hierarchies of configurations, lists of 
> configuration values, etc. quite easily. It has some nice features, like 
> parsing durations automatically. The main advantage, though, is that we would 
> have a standard config system that doesn't have to be manually parsed.
> Here's a quick example of how we can use it:
> {code:java}
> @Inject
> public PackedGrammar(@TypesafeConfig("PackedGrammar.grammar_dir") String grammar_dir,
>                      @TypesafeConfig("PackedGrammar.span_limit") int span_limit,
>                      String owner,
>                      String type) throws FileNotFoundException, IOException ...
> {code}
> and then a config similar to:
> # Joshua configuration file
> {code:javascript}
> config = {
>   default-non-terminal = X
>   goal-symbol = GOAL
>   ...
> 
>   PackedGrammar: {
>     type: thrax,
>     grammar_dir: /local/grammars/...
>     span_limit: 50
>   }
>   ...
> }
> {code}
> Version: TBD, but it's a breaking change so we may consider putting it in 
> Joshua 7.
> Totally open to other config / injection systems if others want to suggest 
> any of their favorites.





[jira] [Updated] (JOSHUA-268) Phrase-based model error (NullPointerException)

2016-06-20 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-268:

Fix Version/s: 6.2

> Phrase-based model error (NullPointerException)
> ---
>
> Key: JOSHUA-268
> URL: https://issues.apache.org/jira/browse/JOSHUA-268
> Project: Joshua
>  Issue Type: Bug
>  Components: decoders
>Affects Versions: 6.0.5
> Environment: fedora 23
>Reporter: Kyle Richardson
>Priority: Minor
> Fix For: 6.2
>
>
> I'm trying to run the phrase.sh example script (the only modification I made 
> was to take out the --optimizer-runs option, because the system says that 
> this is an "Unknown option"). 
> The error comes at the tuning stage (specifically, it fails at some point in 
> the tuning then complains that it cannot find the "joshua.config.final" 
> file). 
> Looking into the log file (tune/joshua.log), it seems to translate and tune a 
> number of sentences, then it raises the following NullPointerException: 
> Memory used after sentence 7 is 42.5 MB
> Translation 7: -30.617 good how is fine
> Input 2: Collecting options took 0.000 seconds
> Input 8: Collecting options took 0.000 seconds
> Input 2: FATAL UNCAUGHT EXCEPTION: null
> java.lang.NullPointerException
> at joshua.decoder.phrase.Candidate.score(Candidate.java:214)
> at joshua.decoder.phrase.Candidate.compareTo(Candidate.java:136)
> at joshua.decoder.phrase.Candidate.compareTo(Candidate.java:19)
> at java.util.HashMap.compareComparables(HashMap.java:371)
> at java.util.HashMap$TreeNode.treeify(HashMap.java:1920)
> at java.util.HashMap.treeifyBin(HashMap.java:771)
> at java.util.HashMap.putVal(HashMap.java:643)
> at java.util.HashMap.put(HashMap.java:611)
> at java.util.HashSet.add(HashSet.java:219)
> at joshua.decoder.phrase.Stack.addCandidate(Stack.java:125)
> at joshua.decoder.phrase.Stacks.search(Stacks.java:166)
> at joshua.decoder.DecoderThread.translate(DecoderThread.java:113)
> at joshua.decoder.Decoder$DecoderThreadRunner.run(Decoder.java:218)
> There's nothing informative in the tune/mert.log; it just says that it exited 
> prematurely. The other processes seem to work as expected (although in the 
> giza.log, there are a number of "Sentence mismatch error! Line " warnings). 
> I'm running this on Fedora 23 with Moses. I had no problems training the 
> hiero model.
> Note:
> There appears to be an open ticket for more or less the same problem 
> (JOSHUA-267). The difference, however, is that in that ticket the tuner 
> appears to fail on the first input, whereas here it already decodes/tunes 
> several inputs before failing (see above). 
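One thing the stack trace does show is that HashSet.add ended up calling Candidate.compareTo: when a hash bin grows large enough to be converted to a tree, HashMap compares Comparable keys directly, so a compareTo that can throw will surface inside add. What was actually null at Candidate.java:214 is not visible here. The sketch below only illustrates a compareTo that tolerates a missing score; ScoredItem is hypothetical and is not Joshua's Candidate class:

```java
// Hypothetical sketch of a null-tolerant compareTo; not Joshua's Candidate.
class ScoredItem implements Comparable<ScoredItem> {
  final String label;
  final Double score; // assumption: may be null until the item has been scored

  ScoredItem(String label, Double score) {
    this.label = label;
    this.score = score;
  }

  /** Orders higher scores first; unscored items sort last instead of throwing. */
  @Override
  public int compareTo(ScoredItem other) {
    if (score == null) {
      return other.score == null ? 0 : 1;
    }
    if (other.score == null) {
      return -1;
    }
    return Double.compare(other.score, score);
  }

  public static void main(String[] args) {
    ScoredItem scored = new ScoredItem("scored", -30.617);
    ScoredItem unscored = new ScoredItem("unscored", null);
    // Negative result: the scored item sorts before the unscored one.
    System.out.println(scored.compareTo(unscored));
  }
}
```

Whether sorting unscored items last is the right behaviour for the decoder is a separate question; the narrower point is that comparison code reachable from a hash container should not throw.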





[jira] [Resolved] (JOSHUA-253) Enable execution of Unit tests

2016-06-02 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved JOSHUA-253.
-
Resolution: Fixed

Yeah, we fixed it in the Maven work.

> Enable execution of Unit tests
> --
>
> Key: JOSHUA-253
> URL: https://issues.apache.org/jira/browse/JOSHUA-253
> Project: Joshua
>  Issue Type: Test
>Affects Versions: 6.0
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
> Fix For: 6.1
>
> Attachments: JOSHUA-253.patch
>
>
> As per our [discussion on this 
> topic|http://www.mail-archive.com/dev%40joshua.incubator.apache.org/msg00270.html],
>  [~teofili] correctly identified that unit level tests are not executed.
> We need to fix this such that they are.





[jira] [Assigned] (JOSHUA-253) Enable execution of Unit tests

2016-06-02 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney reassigned JOSHUA-253:
---

Assignee: Lewis John McGibbney

> Enable execution of Unit tests
> --
>
> Key: JOSHUA-253
> URL: https://issues.apache.org/jira/browse/JOSHUA-253
> Project: Joshua
>  Issue Type: Test
>Affects Versions: 6.0
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
> Fix For: 6.1
>
> Attachments: JOSHUA-253.patch
>
>
> As per our [discussion on this 
> topic|http://www.mail-archive.com/dev%40joshua.incubator.apache.org/msg00270.html],
>  [~teofili] correctly identified that unit level tests are not executed.
> We need to fix this such that they are.





[jira] [Created] (JOSHUA-276) Trivial fixes to 1.8 Javadoc

2016-05-31 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-276:
---

 Summary: Trivial fixes to 1.8 Javadoc
 Key: JOSHUA-276
 URL: https://issues.apache.org/jira/browse/JOSHUA-276
 Project: Joshua
  Issue Type: Bug
  Components: core
Affects Versions: 6.0.5
Reporter: Lewis John McGibbney
Priority: Trivial
 Fix For: 6.1


There are some trivial Javadoc issues to be fixed in the current master branch:
{code}
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 37.358s
[INFO] Finished at: Wed Jun 01 03:28:40 UTC 2016

[INFO] Final Memory: 40M/861M
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-javadoc-plugin:2.8:aggregate (default-cli) on 
project joshua: An error has occurred in JavaDocs report generation:
[ERROR] Exit code: 1 - 
/home/jenkins/jenkins-slave/workspace/joshua_master/src/main/java/org/apache/joshua/decoder/StructuredTranslationFactory.java:47:
 warning: no description for @param
[ERROR] * @param sourceSentence
[ERROR] ^
[ERROR] 
/home/jenkins/jenkins-slave/workspace/joshua_master/src/main/java/org/apache/joshua/decoder/StructuredTranslationFactory.java:48:
 warning: no description for @param
[ERROR] * @param hypergraph
[ERROR] ^
[ERROR] 
/home/jenkins/jenkins-slave/workspace/joshua_master/src/main/java/org/apache/joshua/decoder/StructuredTranslationFactory.java:49:
 warning: no description for @param
[ERROR] * @param featureFunctions
[ERROR] ^
[ERROR] 
/home/jenkins/jenkins-slave/workspace/joshua_master/src/main/java/org/apache/joshua/decoder/ff/FeatureVector.java:80:
 error: reference not found
[ERROR] * features) and in {@link 
org.apache.joshua.decoder.ff.tm.BilingualRule#estimateRuleCost(java.util.List)}
[ERROR] ^
[ERROR] 
[ERROR] Command line was: 
/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.8/jre/../bin/javadoc 
@options @packages
[ERROR] 
[ERROR] Refer to the generated Javadoc files in 
'/home/jenkins/jenkins-slave/workspace/joshua_master/target/site/apidocs' dir.
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
Build step 'Invoke top-level Maven targets' marked build as failure
Publishing Javadoc

{code}





[jira] [Commented] (JOSHUA-252) Make it possible to use Maven to build Joshua

2016-05-31 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15309166#comment-15309166
 ] 

Lewis John McGibbney commented on JOSHUA-252:
-

ACK done
https://builds.apache.org/view/H-L/view/Joshua/job/joshua_master/
There is a transient build slave error which I'll try and sort out.
[~post] NICE WORK :) 

> Make it possible to use Maven to build Joshua
> -
>
> Key: JOSHUA-252
> URL: https://issues.apache.org/jira/browse/JOSHUA-252
> Project: Joshua
>  Issue Type: Improvement
>  Components: build
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 6.1
>
>
> As per discussion on the dev@ list, for now Ant is the official build tool for 
> Joshua; however, we would like to possibly switch to Maven if / when someone is 
> able to do so.
> Assigning to me for now as I may be able to look into this.





[jira] [Created] (JOSHUA-271) Thrax invocation should not rely upon $HADOOP being set

2016-05-24 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-271:
---

 Summary: Thrax invocation should not rely upon $HADOOP being set
 Key: JOSHUA-271
 URL: https://issues.apache.org/jira/browse/JOSHUA-271
 Project: Joshua
  Issue Type: Bug
  Components: pipeline, thrax
Affects Versions: 6.0.5
Reporter: Lewis John McGibbney
 Fix For: 6.1


Right now one cannot run Thrax unless the $HADOOP env variable is defined. 
Every time the hadoop script is invoked, the path is hard-coded as 
$HADOOP/bin/hadoop. However, what happens if you are using a VM (Vagrant) to 
connect to a cluster for which no $HADOOP env variable is defined? 
The hadoop script should be on the path and available to use from there. The 
only check which should be made is whether it is available from the path or 
not; if it is not, then the start_hadoop_cluster subroutine can be called. This 
reduces code and makes more sense.  
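
As a rough illustration of the check proposed above, here is a minimal sketch 
only. It is written in Python rather than the Perl of pipeline.pl, and the 
names find_hadoop, resolve_hadoop and the start_cluster callback are 
hypothetical; the callback merely stands in for the start_hadoop_cluster 
subroutine. The lookup relies on the path alone and never consults $HADOOP:

```python
import shutil


def find_hadoop(path=None):
    """Return the full path of a `hadoop` executable found on PATH, or None.

    `path` overrides the search path (same format as $PATH); when it is
    None the PATH of the current environment is searched.
    """
    return shutil.which("hadoop", path=path)


def resolve_hadoop(path=None, start_cluster=None):
    """Prefer `hadoop` from PATH; otherwise fall back to `start_cluster`.

    `start_cluster` is a hypothetical callback standing in for the
    pipeline's start_hadoop_cluster subroutine. It is assumed to return
    the location of a usable hadoop script.
    """
    found = find_hadoop(path)
    if found is not None:
        return found
    if start_cluster is None:
        raise RuntimeError("hadoop is not on PATH and no fallback was given")
    return start_cluster()
```

With something along these lines, a call such as resolve_hadoop(start_cluster=...) 
would return the hadoop script from the path when one exists and would only 
start a cluster otherwise, so no $HADOOP/bin/hadoop path needs to be built.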





[jira] [Created] (JOSHUA-270) pipeline.pl needs major refactoring

2016-05-24 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-270:
---

 Summary: pipeline.pl needs major refactoring
 Key: JOSHUA-270
 URL: https://issues.apache.org/jira/browse/JOSHUA-270
 Project: Joshua
  Issue Type: Bug
  Components: pipeline
Affects Versions: 6.0.5
Reporter: Lewis John McGibbney
 Fix For: 6.1


Right now 
[pipeline.pl|https://github.com/apache/incubator-joshua/blob/master/scripts/training/pipeline.pl]
 is well over 2000 lines long and extremely difficult to navigate. 
I propose the following:
 * All ENV is refactored into a pipeline_environment file
 * All command line parsing and definitions are refactored into a pipeline_cli 
file
 * Sanity checking is refactored into a pipeline_sanity_check file
 * Dependent variable checking is refactored into a 
pipeline_dependent_variable_setting file
 * Filtering and preprocessing of corpora is refactored into 
pipeline_filter_preprocess_corpora
 * pipeline_subsampling becomes a file
 * pipeline_alignment becomes a file
 * pipeline_parsing becomes a file
 * pipeline_thrax becomes a file
 * pipeline_tuning becomes a file
 * pipeline_testing becomes a file
 * pipeline_subroutines becomes a file





[jira] [Commented] (JOSHUA-262) Implement all logging as Slf4j over Log4j

2016-05-20 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294656#comment-15294656
 ] 

Lewis John McGibbney commented on JOSHUA-262:
-

I honestly have no idea.

> Implement all logging as Slf4j over Log4j
> -
>
> Key: JOSHUA-262
> URL: https://issues.apache.org/jira/browse/JOSHUA-262
> Project: Joshua
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Assignee: Thamme Gowda N
> Fix For: 6.1
>
>
> [~hsaputra] suggested that we implement all logging as Slf4j over Log4j. If 
> we use [parameterized logging 
> notation|http://www.slf4j.org/faq.html#logging_performance] we can have good 
> logging in place.





[jira] [Commented] (JOSHUA-252) Make it possible to use Maven to build Joshua

2016-05-13 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283123#comment-15283123
 ] 

Lewis John McGibbney commented on JOSHUA-252:
-

[~teofili] I am working on this today; I will post a pull request ASAP.

> Make it possible to use Maven to build Joshua
> -
>
> Key: JOSHUA-252
> URL: https://issues.apache.org/jira/browse/JOSHUA-252
> Project: Joshua
>  Issue Type: Improvement
>  Components: build
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 6.1
>
>
> As per discussion on the dev@ list, for now Ant is the official build tool for 
> Joshua; however, we would like to possibly switch to Maven if / when someone is 
> able to do so.
> Assigning to me for now as I may be able to look into this.





[jira] [Updated] (JOSHUA-252) Make it possible to use Maven to build Joshua

2016-05-13 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-252:

Component/s: build

> Make it possible to use Maven to build Joshua
> -
>
> Key: JOSHUA-252
> URL: https://issues.apache.org/jira/browse/JOSHUA-252
> Project: Joshua
>  Issue Type: Improvement
>  Components: build
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 6.1
>
>
> As per discussion on the dev@ list, for now Ant is the official build tool for 
> Joshua; however, we would like to possibly switch to Maven if / when someone is 
> able to do so.
> Assigning to me for now as I may be able to look into this.





[jira] [Updated] (JOSHUA-259) Integration tests are failing

2016-05-13 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-259:

Fix Version/s: 6.1

> Integration tests are failing
> -
>
> Key: JOSHUA-259
> URL: https://issues.apache.org/jira/browse/JOSHUA-259
> Project: Joshua
>  Issue Type: Bug
>Reporter: Kellen Sunderland
> Fix For: 6.1
>
>
> Several integration tests are currently failing with Joshua.  I have a quick 
> fix coming for one of the tests but just in case we need more discussion 
> around the failures I'll open a bug.
> The currently failing tests for me:
> test/decoder/too-long
> test/server/http
> test/server/tcp-text
> test/thrax/extraction
> and 
> test/decoder/moses-compat (but this is easy to fix, simple extra space in the 
> expected file)
> These are failing under OS X 10.11.  If working under other environments feel 
> free to post a 'works for me'.





[jira] [Updated] (JOSHUA-260) Integrate IoC (Inversion of Control) into Joshua

2016-05-13 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-260:

Fix Version/s: 6.1

> Integrate IoC (Inversion of Control) into Joshua
> 
>
> Key: JOSHUA-260
> URL: https://issues.apache.org/jira/browse/JOSHUA-260
> Project: Joshua
>  Issue Type: Improvement
>Reporter: Kellen Sunderland
>Assignee: Kellen Sunderland
> Fix For: 6.1
>
>
> I'd like to propose we look into using Guice 
> (https://github.com/google/guice) in conjunction with Joshua's configuration 
> system.  I believe it would give us a nice way to map what is in the 
> configuration to the code paths and implementations used within Joshua.  It 
> would also go a long way toward allowing us to integrate unit tests throughout 
> all the important classes in Joshua.  What does everyone think?  Would IoC be 
> a good pattern to adopt?  Is everyone OK with using Guice (versus, say, some 
> other IoC library)?





[jira] [Updated] (JOSHUA-262) Implement all logging as Slf4j over Log4j

2016-05-13 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-262:

Component/s: core

> Implement all logging as Slf4j over Log4j
> -
>
> Key: JOSHUA-262
> URL: https://issues.apache.org/jira/browse/JOSHUA-262
> Project: Joshua
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
> Fix For: 6.1
>
>
> [~hsaputra] suggested that we implement all logging as Slf4j over Log4j. If 
> we use [parameterized logging 
> notation|http://www.slf4j.org/faq.html#logging_performance] we can have good 
> logging in place.





[jira] [Created] (JOSHUA-262) Implement all logging as Slf4j over Log4j

2016-05-13 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-262:
---

 Summary: Implement all logging as Slf4j over Log4j
 Key: JOSHUA-262
 URL: https://issues.apache.org/jira/browse/JOSHUA-262
 Project: Joshua
  Issue Type: Improvement
Affects Versions: 6.0.5
Reporter: Lewis John McGibbney
 Fix For: 6.1


[~hsaputra] suggested that we implement all logging as Slf4j over Log4j. If we 
use [parameterized logging 
notation|http://www.slf4j.org/faq.html#logging_performance] we can have good 
logging in place.





[jira] [Commented] (JOSHUA-261) Remove ext directory from source tree

2016-05-09 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276506#comment-15276506
 ] 

Lewis John McGibbney commented on JOSHUA-261:
-

In all honesty, the code can remain in the source tree in SCM but we just can't 
ship it with a release. 

> Remove ext directory from source tree
> -
>
> Key: JOSHUA-261
> URL: https://issues.apache.org/jira/browse/JOSHUA-261
> Project: Joshua
>  Issue Type: Task
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Priority: Blocker
> Fix For: 6.1
>
>
> Right now we have a bunch of code bundled into the 
> [ext|https://github.com/apache/incubator-joshua/tree/master/ext] directory. I 
> don't think any of this code can be shipped with an Apache Joshua 
> (Incubating) release, so we need to think about a mechanism for removing it 
> and making Joshua work in other ways.




