Re: Joshua

2017-08-14 Thread John Hewitt
If you delete grammar.gz and re-run the pipeline script, Joshua will
recognize that it needs to restart from the grammar step.

If the grammar.gz file is something like 24 bytes, then it's an empty
gzipped file.
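
Rather than relying on the file size alone, one way to confirm this is to check whether the archive decompresses to zero bytes. The following is only a minimal sketch: the helper name `is_empty_gzip` is an assumption made for illustration and is not part of Joshua.

```python
import gzip


def is_empty_gzip(path):
    """Return True if the gzip file at `path` decompresses to zero bytes.

    Hypothetical helper for illustration; it is not part of Joshua.
    """
    with gzip.open(path, "rb") as handle:
        # Reading a single byte is enough to tell empty from non-empty.
        return handle.read(1) == b""
```

Calling `is_empty_gzip("grammar.gz")` from the run directory would then tell you whether the grammar step produced any rules at all.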

-John

On Mon, Aug 14, 2017 at 5:57 AM, Arezoo Arjomand 
wrote:

> Hi,
> I want to run Joshua on a server. How much disk space do I need to run
> the LDC corpus that is in the fisher folder?
> Thank you
>


Re: Joshua

2017-08-14 Thread Arezoo Arjomand
Hi, 
I want to run Joshua on a server. How much disk space do I need to run the 
LDC corpus that is in the fisher folder? Thank you


Re: Joshua

2017-08-14 Thread Arezoo Arjomand
How can I run from the grammar step? Could the failure be caused by disk space? 
Disk space is 68 GB... 



Re: Joshua

2017-08-14 Thread Arezoo Arjomand
How can I run from the grammar step?

 

On Monday, August 14, 2017 4:58 AM, Matt Post  wrote:
 

 It looks like grammar sorting is failing. Check the logs to see why. Delete 
grammar.* and try again from that step.


On Aug 14, 2017, at 10:49 AM, Arezoo Arjomand  wrote:
Hi, I added "--aligner giza" to the terminal command. It seems the alignment error is 
fixed, but the grammar error still remains for both the Berkeley aligner and 
giza. grammar.gz is empty and the running dir is attached. 
 
 

On Monday, August 14, 2017 2:08 AM, Matt Post  wrote:
 

 It looks like alignment failed. Is there a file alignments/training.align? 
That is built from the two pieces, under alignments/0/giza.SRC-TRG (and 
TRG-SRC) that failed.


On Aug 13, 2017, at 7:21 PM, Arezoo Arjomand  wrote:
Hi, when I run the pipeline the following error is shown. The previous error, 
described in my previous email, is shown when I run the same dir a second time, 
and grammar.gz is empty. 
 How can I fix the following error? 

[source-numlines] rebuilding...
  dep=/home/arezoo1/joshua-tutorial/runs/02/data/train/corpus.es [CHANGED]
  cmd=cat /home/arezoo1/joshua-tutorial/runs/02/data/train/corpus.es | wc -l
  took 0 seconds (0s)
[source-numlines] retrieved cached result => 77457
[giza-0] rebuilding...
  dep=/home/arezoo1/joshua-tutorial/runs/02/data/train/splits/0/corpus.es 
[CHANGED]
  dep=/home/arezoo1/joshua-tutorial/runs/02/data/train/splits/0/corpus.en 
[CHANGED]
  dep=alignments/0/model/aligned.grow-diag-final [NOT FOUND]
  cmd=rm -f alignments/0/corpus.0-0.*; 
/home/arezoo1/joshua-tutorial/joshua/scripts/training/run-giza.pl --root-dir 
alignments/0 -e en -f es -corpus 
/home/arezoo1/joshua-tutorial/runs/02/data/train/splits/0/corpus -merge 
grow-diag-final  > alignments/0/giza.log 2>&1
*** Error in `/home/arezoo1/joshua-tutorial/joshua/ext/symal/symal': double 
free or corruption (out): 0x556a69b42160 ***
=== Backtrace: =
/lib/x86_64-linux-gnu/libc.so.6(+0x7908b)[0x7f91d0fb908b]
/lib/x86_64-linux-gnu/libc.so.6(+0x826fa)[0x7f91d0fc26fa]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f91d0fc612c]
/home/arezoo1/joshua-tutorial/joshua/ext/symal/symal(+0x2b5a)[0x556a6993ab5a]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f91d0f603f1]
/home/arezoo1/joshua-tutorial/joshua/ext/symal/symal(+0x5f4a)[0x556a6993df4a]
=== Memory map: 
556a69938000-556a69941000 r-xp  08:0a 1051501    
/home/arezoo1/joshua-tutorial/joshua/ext/symal/symal
556a69b41000-556a69b42000 r--p 9000 08:0a 1051501    
/home/arezoo1/joshua-tutorial/joshua/ext/symal/symal
556a69b42000-556a69b43000 rw-p a000 08:0a 1051501    
/home/arezoo1/joshua-tutorial/joshua/ext/symal/symal
556a69b43000-556a69b45000 rw-p  00:00 0 
556a6af09000-556a6afbf000 rw-p  00:00 0  [heap]
7f91cc00-7f91cc021000 rw-p  00:00 0 
7f91cc021000-7f91d000 ---p  00:00 0 
7f91d0c37000-7f91d0d3f000 r-xp  08:0a 1708999    
/lib/x86_64-linux-gnu/libm-2.24.so
7f91d0d3f000-7f91d0f3e000 ---p 00108000 08:0a 1708999    
/lib/x86_64-linux-gnu/libm-2.24.so
7f91d0f3e000-7f91d0f3f000 r--p 00107000 08:0a 1708999    
/lib/x86_64-linux-gnu/libm-2.24.so
7f91d0f3f000-7f91d0f4 rw-p 00108000 08:0a 1708999    
/lib/x86_64-linux-gnu/libm-2.24.so
7f91d0f4-7f91d10fd000 r-xp  08:0a 1708931    
/lib/x86_64-linux-gnu/libc-2.24.so
7f91d10fd000-7f91d12fd000 ---p 001bd000 08:0a 1708931    
/lib/x86_64-linux-gnu/libc-2.24.so
7f91d12fd000-7f91d1301000 r--p 001bd000 08:0a 1708931    
/lib/x86_64-linux-gnu/libc-2.24.so
7f91d1301000-7f91d1303000 rw-p 001c1000 08:0a 1708931    
/lib/x86_64-linux-gnu/libc-2.24.so
7f91d1303000-7f91d1307000 rw-p  00:00 0 
7f91d1307000-7f91d131d000 r-xp  08:0a 1708971    
/lib/x86_64-linux-gnu/libgcc_s.so.1
7f91d131d000-7f91d151c000 ---p 00016000 08:0a 1708971    
/lib/x86_64-linux-gnu/libgcc_s.so.1
7f91d151c000-7f91d151d000 r--p 00015000 08:0a 1708971    
/lib/x86_64-linux-gnu/libgcc_s.so.1
7f91d151d000-7f91d151e000 rw-p 00016000 08:0a 1708971    
/lib/x86_64-linux-gnu/libgcc_s.so.1
7f91d151e000-7f91d1697000 r-xp  08:0a 1976366    
/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.22
7f91d1697000-7f91d1896000 ---p 00179000 08:0a 1976366    
/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.22
7f91d1896000-7f91d18a r--p 00178000 08:0a 1976366    
/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.22
7f91d18a-7f91d18a2000 rw-p 00182000 08:0a 1976366    
/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.22
7f91d18a2000-7f91d18a6000 rw-p  00:00 0 
7f91d18a6000-7f91d18cb000 r-xp 

Re: Joshua

2017-08-14 Thread Arezoo Arjomand
How can I run from the grammar step? 
 --
Best Regards
Arezoo Arjomandzadeh
MSc student in Artificial Intelligence
Computer & IT engineering 
Shahrood University of Technology, Iran
 


Re: Joshua

2017-08-14 Thread Matt Post
It looks like grammar sorting is failing. Check the logs to see why. Delete 
grammar.* and try again from that step.
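
For anyone who wants to script the cleanup, here is a minimal sketch of the "delete grammar.*" step. It assumes the grammar files sit directly under the run directory (the thread does not confirm the exact location), and the helper name `remove_grammar_files` is made up for illustration. Directories are deliberately left alone.

```python
from pathlib import Path


def remove_grammar_files(run_dir):
    """Delete regular files matching grammar.* directly under run_dir.

    Hypothetical helper; assumes the grammar files live at the top level
    of the run directory. Returns the names of the files it removed.
    """
    removed = []
    for path in sorted(Path(run_dir).glob("grammar.*")):
        if path.is_file():  # skip directories that happen to match
            path.unlink()
            removed.append(path.name)
    return removed
```

After this, re-running the pipeline script in the same run directory should pick up again at the grammar step, as described above.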


> On Aug 14, 2017, at 10:49 AM, Arezoo Arjomand wrote:
> 
> Hi,
> I added "--aligner giza" to the terminal command. It seems the alignment error is 
> fixed, but the grammar error still remains for both the Berkeley aligner and 
> giza. grammar.gz is empty and the running dir is attached. 
>  
> 
> 
> 
> On Monday, August 14, 2017 2:08 AM, Matt Post wrote:
> 
> 
> It looks like alignment failed. Is there a file alignments/training.align? 
> That is built from the two pieces, under alignments/0/giza.SRC-TRG (and 
> TRG-SRC) that failed.
> 
> 

Re: joshua prints hyphens instead of translation

2017-07-16 Thread Matt Post
Hi Nicoara,

I finally had a minute to look at this, and it seems to me that everything is 
working fine, and this is just the normal kind of noise you might expect from 
MT systems when tested on data that is different from what they are trained on. 
You have somehow picked two sentences ("hello" and "how are you") that are not 
translated well, but others seem to work fine:

$ pwd
$HOME/language-packs/apache-joshua-en-de-2017-01-31
$ cat example.en 
hello
how are you
this is a test
This is a test .
Those who hurt others hurt themselves .
I think this event is best described as the state enforcing surveillance as the normative form of care
$ cat example.en  | ./joshua
-
-
Dies ist ein test
Dies ist ein test.
Verletzt die selbst verletzt worden sein.
Das ereignis dagegen, da der staat als form der versorgung normativen durchsetzung überwachung

The model we have provided is a relatively small phrase-based model trained 
mostly on news data. One would hope that it would handle sentences like the ones 
you provided, but I am not too surprised that it didn't do very well.
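
To see at a glance which inputs came back untranslated, the input and output lines from the transcript above can be paired up. This is only an illustrative sketch: the helper name `untranslated` is an assumption and is not part of Joshua, and the two lists are copied from the example run shown above.

```python
def untranslated(sources, outputs):
    """Return the source lines whose decoder output is only a hyphen."""
    if len(sources) != len(outputs):
        raise ValueError("expected one output line per source line")
    return [src for src, out in zip(sources, outputs) if out.strip() == "-"]


# Input and output lines copied from the example run above.
sources = [
    "hello",
    "how are you",
    "this is a test",
    "This is a test .",
    "Those who hurt others hurt themselves .",
    "I think this event is best described as the state enforcing "
    "surveillance as the normative form of care",
]
outputs = [
    "-",
    "-",
    "Dies ist ein test",
    "Dies ist ein test.",
    "Verletzt die selbst verletzt worden sein.",
    "Das ereignis dagegen, da der staat als form der versorgung "
    "normativen durchsetzung überwachung",
]

print(untranslated(sources, outputs))  # prints ['hello', 'how are you']
```

Only the two short greetings come back as hyphens; the other four inputs produce German output.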

matt


> On Jul 11, 2017, at 5:09 PM, Nicoara Talpes wrote:
> 
> Hello,
> 
> I think this is an important issue to solve for the following reasons:
> 
> 1) this is one of only two language packs that have both *-en and en-*. 
> 2) there seems to be no solution to running the language pack on Windows
> 3) German is a very widely used language
> 
> Please let me know when this direction en-de is resolved or if it is running 
> on any other machine (maybe mine has an issue).
> I have a project where I am attempting to make use of this language pack 
> specifically.
> 
> Thank you,
> Nicoara
> 
> On Thu, Jul 6, 2017 at 5:19 PM, Matt Post wrote:
> Hi,
> 
> Something is clearly wrong but it is not obvious from the output. I hope to 
> look into this soon. You might try another en-* language pack to see if that 
> has the same problem in the meantime which would help isolate this.
> 
> matt
> 
>> On Jul 5, 2017, at 12:02 PM, Nicoara Talpes wrote:
>> 
>> Hello,
>> Is the response OK?
>> Thank you
>> 
>> 
>> 
>> On July 4, 2017, at 7:59 PM, Nicoara Talpes wrote:
>> 
>> 
>> Hello,
>> 
>> Here it is:
>> 
>> nicoara@ubuntu:~/Desktop/joshua/apache-joshua-en-de-2017-01-31$ head 
>> example.SRC | ./prepare.sh  | ./joshua -v 1
>> INFO - Parameters read from configuration file: joshua.config
>> INFO - tm = 'moses -path model/grammar.gz.packed -maxspan 0 -owner pt'
>> INFO - defaultnonterminal = 'X'
>> INFO - goalsymbol = 'GOAL'
>> INFO - markoovs = 'false'
>> INFO - search = 'stack'
>> INFO - pop-limit: 100
>> INFO - poplimit = '100'
>> INFO - topn = '1'
>> INFO - useuniquenbest = 'true'
>> INFO - outputformat = '%S'
>> INFO - includealignindex = 'false'
>> INFO - featurefunction = 'OOVPenalty'
>> INFO - featurefunction = 'WordPenalty'
>> INFO - featurefunction = 'PhrasePenalty'
>> INFO - featurefunction = 'Distortion'
>> INFO - featurefunction = 'LanguageModel -lm_type berkeleylm -lm_order 4 
>> -lm_file model/lm.berkeleylm'
>> INFO - lowercase = 'true'
>> INFO - projectcase = 'true'
>> INFO - c = 'joshua.config'
>> INFO - v = '0'
>> INFO - v = '1'
>> INFO - Read 9 weights (0 of them dense)
>> INFO - Reading vocabulary: model/grammar.gz.packed/vocabulary
>> INFO - Read 1404929 entries from the vocabulary
>> INFO - Reading packed config: model/grammar.gz.packed/config
>> 102030405060708090.100%
>> INFO - Reading encoder configuration: model/grammar.gz.packed/encoding
>> INFO - Loaded 64487199 rules
>> INFO - Memory used 2360.054904 MB
>> INFO - Grammar loading took: 151 seconds.
>> INFO - Stateful object with state index 0
>> INFO - Loading Berkeley LM from binary model/lm.berkeleylm
>> INFO - FEATURE: tm_pt (weight 0.000)
>> INFO - FEATURE: OOVPenalty (weight 0.016)
>> INFO - FEATURE: WordPenalty (weight -0.279)
>> INFO - FEATURE: PhrasePenalty (weight 0.001)
>> INFO - FEATURE: Distortion (weight 0.123)
>> INFO - FEATURE: lm_0, order 4 (weight 0.314), classLm=false
>> INFO - Grammar sorting happening lazily on-demand.
>> INFO - Model loading took 176 seconds
>> INFO - Memory used 2823.330152 MB
>> INFO - Input 0:  hello 
>> INFO - Input 0: Collecting options took 0.0 seconds
>> INFO - Input 0: Search took 0.073 seconds
>> INFO - Input 0: Translation took 0.824 seconds
>> INFO - Input 0: Memory used is 2828.316456 MB
>> INFO - Input 0: 1-best extraction took 0.172 seconds
>> -
>> INFO - Input 1:  how are you 
>> INFO - Input 1: Collecting options took 0.0 seconds
>> INFO - Input 1: Search took 0.94 seconds
>> INFO - Input 1: Translation took 5.397 seconds
>> INFO - 

Re: joshua prints hyphens instead of translation

2017-07-11 Thread Nicoara Talpes
Hello,

I think this is an important issue to solve for the following reasons:

1) this is one of only two language packs that have both *-en and en-*.
2) there seems to be no solution to running the language pack on Windows
3) German is a very widely used language

Please let me know when this direction en-de is resolved or if it is
running on any other machine (maybe mine has an issue).
I have a project where I am attempting to make use of this language pack
specifically.

Thank you,
Nicoara

On Thu, Jul 6, 2017 at 5:19 PM, Matt Post  wrote:

> Hi,
>
> Something is clearly wrong but it is not obvious from the output. I hope
> to look into this soon. You might try another en-* language pack to see if
> that has the same problem in the meantime which would help isolate this.
>
> matt
>
> On Jul 5, 2017, at 12:02 PM, Nicoara Talpes 
> wrote:
>
> Hello,
> Is the response OK?
> Thank you
>
>
> On July 4, 2017, at 7:59 PM, Nicoara Talpes 
> wrote:
>
>
> Hello,
>
> Here it is:
>
> nicoara@ubuntu:~/Desktop/joshua/apache-joshua-en-de-2017-01-31$ head
> example.SRC | ./prepare.sh | ./joshua -v 1
> INFO - Parameters read from configuration file: joshua.config
> INFO - tm = 'moses -path model/grammar.gz.packed -maxspan 0 -owner pt'
> INFO - defaultnonterminal = 'X'
> INFO - goalsymbol = 'GOAL'
> INFO - markoovs = 'false'
> INFO - search = 'stack'
> INFO - pop-limit: 100
> INFO - poplimit = '100'
> INFO - topn = '1'
> INFO - useuniquenbest = 'true'
> INFO - outputformat = '%S'
> INFO - includealignindex = 'false'
> INFO - featurefunction = 'OOVPenalty'
> INFO - featurefunction = 'WordPenalty'
> INFO - featurefunction = 'PhrasePenalty'
> INFO - featurefunction = 'Distortion'
> INFO - featurefunction = 'LanguageModel -lm_type berkeleylm -lm_order
> 4 -lm_file model/lm.berkeleylm'
> INFO - lowercase = 'true'
> INFO - projectcase = 'true'
> INFO - c = 'joshua.config'
> INFO - v = '0'
> INFO - v = '1'
> INFO - Read 9 weights (0 of them dense)
> INFO - Reading vocabulary: model/grammar.gz.packed/vocabulary
> INFO - Read 1404929 entries from the vocabulary
> INFO - Reading packed config: model/grammar.gz.packed/config
> 1020304050
> 60708090.100%
> INFO - Reading encoder configuration: model/grammar.gz.packed/encoding
> INFO - Loaded 64487199 rules
> INFO - Memory used 2360.054904 MB
> INFO - Grammar loading took: 151 seconds.
> INFO - Stateful object with state index 0
> INFO - Loading Berkeley LM from binary model/lm.berkeleylm
> INFO - FEATURE: tm_pt (weight 0.000)
> INFO - FEATURE: OOVPenalty (weight 0.016)
> INFO - FEATURE: WordPenalty (weight -0.279)
> INFO - FEATURE: PhrasePenalty (weight 0.001)
> INFO - FEATURE: Distortion (weight 0.123)
> INFO - FEATURE: lm_0, order 4 (weight 0.314), classLm=false
> INFO - Grammar sorting happening lazily on-demand.
> INFO - Model loading took 176 seconds
> INFO - Memory used 2823.330152 MB
> INFO - Input 0:  hello 
> INFO - Input 0: Collecting options took 0.0 seconds
> INFO - Input 0: Search took 0.073 seconds
> INFO - Input 0: Translation took 0.824 seconds
> INFO - Input 0: Memory used is 2828.316456 MB
> INFO - Input 0: 1-best extraction took 0.172 seconds
> -
> INFO - Input 1:  how are you 
> INFO - Input 1: Collecting options took 0.0 seconds
> INFO - Input 1: Search took 0.94 seconds
> INFO - Input 1: Translation took 5.397 seconds
> INFO - Input 1: Memory used is 2858.289328 MB
> INFO - Input 1: 1-best extraction took 0.468 seconds
> -
> INFO - Decoding completed.
> INFO - Memory used 2858.289328 MB
> INFO - Total running time: 183 seconds
>
> Did this output the translation anywhere?
>
> Also, can you tell me what the corresponding command is on Windows? I
> could try it there.
>
> Thank you,
>
> Nicoara
>
>
> On Tue, Jul 4, 2017 at 7:33 PM, Matt Post  wrote:
>
>> Sorry, that should be "joshua -v 1", can you show that, please?
>>
>>
>> On Jul 4, 2017, at 12:25 PM, Nicoara Talpes 
>> wrote:
>>
>> Hello,
>>
>> Thanks for responding.
>>
>> Here is the output; I hope it clarifies things a little:
>>
>> nicoara@ubuntu:~/Desktop/joshua/apache-joshua-en-de-2017-01-31$ head
>> example.SRC
>> hello
>> how are you
>> nicoara@ubuntu:~/Desktop/joshua/apache-joshua-en-de-2017-01-31$ head
>> example.SRC | ./prepare.sh
>> hello
>> how are you
>> nicoara@ubuntu:~/Desktop/joshua/apache-joshua-en-de-2017-01-31$ head
>> example.SRC | ./prepare.sh | ./joshua -v
>> Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 5
>> at org.apache.joshua.decoder.ArgsParser.<init>(ArgsParser.java:60)
>> at org.apache.joshua.decoder.JoshuaDecoder.main(JoshuaDecoder.java:61)
>> nicoara@ubuntu:~/Desktop/joshua/apache-joshua-en-de-2017-01-31$ cat
>> example.SRC | ./prepare.sh | ./joshua
>> -
>> -
>>
>> How to proceed?
>>
>> 

Re: joshua prints hyphens instead of translation

2017-07-05 Thread Nicoara Talpes
Hello,
Is the response OK?
Thank you


Re: joshua prints hyphens instead of translation

2017-07-04 Thread Matt Post
Sorry, that should be "joshua -v 1", can you show that, please?


> On Jul 4, 2017, at 12:25 PM, Nicoara Talpes wrote:
> 
> Hello,
> 
> Thanks for responding.
> 
> Here is the output; I hope it clarifies things a little:
> 
> nicoara@ubuntu:~/Desktop/joshua/apache-joshua-en-de-2017-01-31$ head 
> example.SRC
> hello 
> how are you
> nicoara@ubuntu:~/Desktop/joshua/apache-joshua-en-de-2017-01-31$ head 
> example.SRC | ./prepare.sh 
> hello
> how are you
> nicoara@ubuntu:~/Desktop/joshua/apache-joshua-en-de-2017-01-31$ head 
> example.SRC | ./prepare.sh | ./joshua -v
> Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 5
> at org.apache.joshua.decoder.ArgsParser.<init>(ArgsParser.java:60)
> at org.apache.joshua.decoder.JoshuaDecoder.main(JoshuaDecoder.java:61)
> nicoara@ubuntu:~/Desktop/joshua/apache-joshua-en-de-2017-01-31$ cat 
> example.SRC | ./prepare.sh | ./joshua
> -
> -
> 
> How to proceed? 
> 
> Thank you
> 
> On Tue, Jul 4, 2017 at 4:31 PM, Matt Post wrote:
> Hi Nicoara,
> 
> Sorry, I seem to have missed your followup question. 
> 
> Can you please debug this a little? What do you get from the following 
> commands?
> 
>   head example.SRC
>   head example.SRC | ./prepare.sh
>   head example.SRC | ./prepare.sh | ./joshua -v
> 
> matt
> 
> 
>> On Jul 3, 2017, at 5:30 PM, Nicoara Talpes wrote:
>> 
>> Hello Joshua Community,
>> 
>> I am running "cat example.SRC | ./prepare.sh | ./joshua" in a terminal on an 
>> Ubuntu machine, but for every line in the example.SRC, there is a hyphen 
>> shown on the terminal. No translation seems to be happening. I am using the 
>> English-German pack.
>> 
>> How to fix this?
>> 
>> Is there a FAQ that I missed where this is addressed?
>> 
>> Apologies for asking this question again in a matter of a few days, but I am 
>> in need of a response to move forward.
>> 
>> Thank you,
>> 
>> Nicoara
> 
> 



Re: joshua prints hyphens instead of translation

2017-07-04 Thread Nicoara Talpes
Hello,

Thanks for responding.

Here is the output; I hope it clarifies things a little:

nicoara@ubuntu:~/Desktop/joshua/apache-joshua-en-de-2017-01-31$ head example.SRC
hello
how are you
nicoara@ubuntu:~/Desktop/joshua/apache-joshua-en-de-2017-01-31$ head example.SRC | ./prepare.sh
hello
how are you
nicoara@ubuntu:~/Desktop/joshua/apache-joshua-en-de-2017-01-31$ head example.SRC | ./prepare.sh | ./joshua -v
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 5
    at org.apache.joshua.decoder.ArgsParser.<init>(ArgsParser.java:60)
    at org.apache.joshua.decoder.JoshuaDecoder.main(JoshuaDecoder.java:61)
nicoara@ubuntu:~/Desktop/joshua/apache-joshua-en-de-2017-01-31$ cat example.SRC | ./prepare.sh | ./joshua
-
-

How to proceed?

Thank you

On Tue, Jul 4, 2017 at 4:31 PM, Matt Post  wrote:

> Hi Nicoara,
>
> Sorry, I seem to have missed your followup question.
>
> Can you please debug this a little? What do you get from the following
> commands?
>
> head example.SRC
> head example.SRC | ./prepare.sh
> head example.SRC | ./prepare.sh | ./joshua -v
>
> matt
>
>
> On Jul 3, 2017, at 5:30 PM, Nicoara Talpes 
> wrote:
>
> Hello Joshua Community,
>
> I am running "cat example.SRC | ./prepare.sh | ./joshua" in a terminal on
> an Ubuntu machine, but for every line in the example.SRC, there is a
> hyphen shown on the terminal. No translation seems to be happening. I am
> using the English-German pack.
>
> How to fix this?
>
> Is there a FAQ that I missed where this is addressed?
>
> Apologies for asking this question again in a matter of a few days, but I
> am in need of a response to move forward.
>
> Thank you,
>
> Nicoara
>
>
>


Re: joshua prints hyphens instead of translation

2017-07-04 Thread Matt Post
Hi Nicoara,

Sorry, I seem to have missed your followup question. 

Can you please debug this a little? What do you get from the following commands?

head example.SRC
head example.SRC | ./prepare.sh
head example.SRC | ./prepare.sh | ./joshua -v
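
If it is easier, the same stage-by-stage check can be scripted so that the output of every stage is captured in one go. This is only a rough sketch: the helper name `run_stages` is made up for illustration, and it assumes each stage reads from stdin and writes to stdout, as the commands above do.

```python
import subprocess


def run_stages(text, stages):
    """Feed `text` through each command in turn.

    Returns a list of (command, stdout) pairs, one per stage, so the
    first stage that produces unexpected output is easy to spot.
    A stage that exits with a non-zero status raises CalledProcessError.
    """
    results = []
    data = text
    for command in stages:
        completed = subprocess.run(
            command, input=data, capture_output=True, text=True, check=True
        )
        data = completed.stdout  # output of this stage feeds the next one
        results.append((command, data))
    return results


# Hypothetical usage from inside the language pack directory:
#   with open("example.SRC") as f:
#       for command, output in run_stages(f.read(), [["./prepare.sh"], ["./joshua"]]):
#           print(command, repr(output))
```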

matt


> On Jul 3, 2017, at 5:30 PM, Nicoara Talpes wrote:
> 
> Hello Joshua Community,
> 
> I am running "cat example.SRC | ./prepare.sh | ./joshua" in a terminal on an 
> Ubuntu machine, but for every line in the example.SRC, there is a hyphen 
> shown on the terminal. No translation seems to be happening. I am using the 
> English-German pack.
> 
> How to fix this?
> 
> Is there a FAQ that I missed where this is addressed?
> 
> Apologies for asking this question again in a matter of a few days, but I am 
> in need of a response to move forward.
> 
> Thank you,
> 
> Nicoara