There is not enough information for me to answer your question. I don't see any problems.
$ echo "i'll give you 10% of the asking price" | ./prepare.sh | ./joshua
I'll give you 10 % of the asking price

> On Dec 16, 2016, at 3:22 AM, Aliaksei Rudak <alru...@gmail.com> wrote:
>
> Also, there is a problem with parsing the percent sign (%) in sentences.
> Do you know how to solve this?
>
> 2016-12-15 10:57 GMT+03:00 Aliaksei Rudak <alru...@gmail.com>:
> Hi Matt,
>
> The English-Russian language pack has a broken link:
> https://cwiki.apache.org/confluence/home.apache.org/~lewismc/language-pack-en-ru-2016-10-28.tar.gz
>
> When do you plan to create and upload other languages?
>
> Regards,
> Alexei
>
> 2016-12-14 21:50 GMT+03:00 Matt Post <p...@cs.jhu.edu>:
> 1. If you download Joshua from GitHub and run "download_dependencies.sh",
> it builds KenLM and the KenLM library. If you can do that, that is all you
> need to do.
>
> 2. http://opus.lingfil.uu.se is a great place to get parallel data; it's
> where we got all the data we use.
>
> 3. Joshua has a Java API (undocumented) but not a C++ one.
>
>> On Dec 14, 2016, at 10:30 AM, Aliaksei Rudak <alru...@gmail.com> wrote:
>>
>> 1) Can you estimate an approximate release date for language packs with a
>> KenLM model? I have a teammate who knows C++ well, so if we had more
>> information (or a tutorial) on how to do that ourselves, we could share
>> the result with others. It would benefit everyone.
>>
>> 2) Where can I get or buy parallel corpora for other languages? Where did
>> you get the data for the current huge language packs? I found several
>> sources, but they are very small.
>>
>> 3) Is there any document on how to create an offline translation system
>> based on Joshua and package it as a C++ library, for example?
>>
>> 2016-12-14 14:33 GMT+03:00 Matt Post <p...@cs.jhu.edu>:
>> 1.
>> The LM cannot be used with Moses. We have BerkeleyLM format; you need
>> KenLM. We are releasing KenLM soon. KenLM is better, but it requires the
>> user to compile C++ code, which can be tricky.
>>
>> 2/3. Please see the README in each language pack. You need to pass input
>> text through "prepare.sh", which does tokenization.
>>
>> matt (from my phone)
>>
>> On Dec 14, 2016, at 06:16, Aliaksei Rudak <alru...@gmail.com> wrote:
>>
>>> Hi Matt,
>>> Thanks for the answers.
>>>
>>> 1) Can the language models inside Joshua language packs work with Moses
>>> MT? If yes, can you give me a link on how to run them with it?
>>>
>>> 2) I installed several instances (German, Spanish, Russian), and all of
>>> them have the same strange issue when translating one sentence.
>>>
>>> For example, from Spanish to English,
>>> "Además podrás encontrar las audiciones de los textos con distintos
>>> acentos del español."
>>>
>>> translates as
>>> "Also auditions, you'll find texts with different accents of español"
>>>
>>> That is, one word in the sentence (español) is not translated correctly,
>>> although it is fine if you translate the single word (español).
>>>
>>> It is the same for other languages (German, Russian). All words except
>>> one or sometimes two are translated. Do you know how to fix this?
>>>
>>> 3) How do I translate sentences with punctuation marks (commas,
>>> exclamation marks, question marks, etc.)?
>>>
>>> Translating from Spanish to English gives an error:
>>> "¿Se puede aprender a escribir? ¿El escritor nace o se hace? La vieja
>>> pregunta."
>>>
>>> If you try to translate words separated by commas, it does not translate
>>> these words:
>>> "inglés, francés, alemán y portugués"
>>>
>>> Output:
>>> "Inglés, francés, german and portuguese"
>>>
>>> Regards,
>>> Alexei
>>>
>>> 2016-12-13 17:44 GMT+03:00 Matt Post <p...@cs.jhu.edu>:
>>>
>>>> On Dec 12, 2016, at 3:04 PM, Aliaksei Rudak <alru...@gmail.com> wrote:
>>>>
>>>> 1) If the English-German pair is recompiled to German-English (the
>>>> reverse direction), do I need a separate instance to process the back
>>>> translation? Or can they work in one instance in both directions?
>>>>
>>> A whole new model needs to be trained. You need a separate model for each
>>> direction.
>>>
>>>> 2) Are there any documents about how to recompile a model to work in the
>>>> reverse direction, from German-English to English-German?
>>>>
>>>> On this page, under the “Project Info” title, the links “Community page”
>>>> and “Current Documentation” are not working:
>>>>
>>>> http://incubator.apache.org/projects/joshua.html
>>>>
>>> This document on running the pipeline:
>>>
>>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65871630
>>>
>>>> 3) Are there ways of increasing translation quality without changing
>>>> (extending) the language model?
>>>>
>>>> On this page, under “How do I make Joshua produce better results?”, the
>>>> link in the second option (Joshua directly) is not working:
>>>>
>>>> http://joshua.incubator.apache.org/6.0/faq.html
>>>>
>>> Yes, but it's complicated. The best way is to add data, but there are lots
>>> of other models and parameter variations that could be tried.
>>>
>>>> 4) How can I reduce the amount of memory each language pair instance uses
>>>> without losing processing speed and quality?
>>>>
>>> If you can find German–French parallel data, use that. Otherwise, pivot
>>> through another language.
>>>
>>>> 5) To translate from German to French, do I need to translate via
>>>> English (German to English first, and then English to French)?
>>>>
>>>> I mean for the case without German-French parallel data.
>>>>
>>>> Regards,
>>>>
>>>> Alexei
>>>>
>>>> 2016-12-12 17:58 GMT+03:00 Matt Post <p...@cs.jhu.edu>:
>>>> No, each has to be run separately. But not all are equally good, so I
>>>> suggest starting with a few and building up.
>>>>
>>>> If you get KenLM working in place of BerkeleyLM, the language models will
>>>> be shared between them if they are on the same machine. I will post
>>>> instructions soon.
>>>>
>>>> Yes, each one has two language models that are interpolated.
>>>>
>>>>> On Dec 12, 2016, at 9:20 AM, Aliaksei Rudak <alru...@gmail.com> wrote:
>>>>>
>>>>> Hi Matt,
>>>>>
>>>>> You were right about increasing memory. Spanish works fine now but needs
>>>>> about 16 GB to run. Is it possible to use one Joshua instance for all
>>>>> language pairs simultaneously? Right now I use one instance for each
>>>>> pair, and it takes about 4 GB, so for all 60 languages I need 240 GB of
>>>>> RAM and 60 running instances. But maybe it's possible to process all
>>>>> language translation with one instance and use, for example, 32 GB?
>>>>>
>>>>> Also, I found that every language pair archive has 2 language models
>>>>> (Berkeley and KenLM). Do I need both at once? Or does Joshua select
>>>>> one of them depending on some parameters?
>>>>>
>>>>> Regards,
>>>>> Alexei
>>>>>
>>>>> 2016-12-07 15:51 GMT+03:00 Matt Post <p...@cs.jhu.edu>:
>>>>> I fixed the Czech link.
>>>>>
>>>>> For Spanish–English, what is the error?
>>>>> I imagine you have to provide more memory. Edit the "joshua" script and
>>>>> double or triple the amount of memory.
>>>>>
>>>>>> On Dec 7, 2016, at 7:14 AM, Aliaksei Rudak <alru...@gmail.com> wrote:
>>>>>>
>>>>>> Hi Matt,
>>>>>>
>>>>>> Can you check the Czech-English language pack? It has a broken link.
>>>>>> The Spanish-English pair does not work; it throws exceptions.
>>>>>>
>>>>>> Regards,
>>>>>> Alexei
>>>>>>
>>>>>> 2016-11-28 17:30 GMT+03:00 <alru...@gmail.com>:
>>>>>> Hi Matt, how much time (and what total price) would it take to record
>>>>>> a video of how to make the English-German pair translate in the
>>>>>> reverse direction (from German to English)?
>>>>>>
>>>>>> Regards,
>>>>>> Alexei
>>>>>>
>>>>>> On Nov 28, 2016, at 17:59, Matt Post <p...@cs.jhu.edu> wrote:
>>>>>>
>>>>>>> Inline below:
>>>>>>>
>>>>>>>> On Nov 26, 2016, at 11:12 AM, Aliaksei Rudak <alru...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Hi Matt,
>>>>>>>>
>>>>>>>> We need to prepare all the infrastructure now so you can make changes
>>>>>>>> in the future. Preparation will take time. Right now I have several
>>>>>>>> questions about all these things.
>>>>>>>>
>>>>>>>> 1) Does Joshua have a language auto-detect feature? If yes, how do I
>>>>>>>> use it? If not, is it hard to add?
>>>>>>>>
>>>>>>> This feature is called LID ("language ID"). It is not in Joshua
>>>>>>> currently but we have talked about it, and it wouldn't be too difficult
>>>>>>> to add in.
>>>>>>>
>>>>>>>> 2) On this page
>>>>>>>>
>>>>>>>> https://cwiki.apache.org/confluence/display/JOSHUA/Notes+on+Language+Pack+Creation
>>>>>>>>
>>>>>>>> in the first sentence there is a link to “Corpus” at the end, where
>>>>>>>> the language datasets should be located, but when I clicked on the
>>>>>>>> link it gave me the English-German pack to download.
>>>>>>>>
>>>>>>>> Is this the correct behavior? If not, can you give me the link to
>>>>>>>> such datasets?
>>>>>>>>
>>>>>>> Sorry, the link should go to <http://opus.lingfil.uu.se/>. I just
>>>>>>> fixed it.
>>>>>>>
>>>>>>>> 3) Can you record a video of your screen showing how to recompile a
>>>>>>>> language pair to translate in the reverse direction, that is, to make
>>>>>>>> the English-German pair translate from German to English?
>>>>>>>>
>>>>>>>> Can I pay for such a video without a contract now (for example, I can
>>>>>>>> mark the PayPal payment to say that I'm paying for your assistance)?
>>>>>>>>
>>>>>>>> We need to do the initial setup of the whole system and check how
>>>>>>>> much assistance we need, when, and where.
>>>>>>>>
>>>>>>>> 4) What kind of contract and conditions do you prefer? What is your
>>>>>>>> hourly rate?
>>>>>>>>
>>>>>>> I still have to confirm with my employer that I am allowed to engage in
>>>>>>> outside work. My hourly rate is $250. I would give you estimates ahead
>>>>>>> of time so you could know what it would cost you.
>>>>>>>
>>>>>>> If that sounds good to you, can you clarify for me who the money would
>>>>>>> be coming from? If it's a company, what is the name of the company, and
>>>>>>> where is it incorporated? If it's a person, what is their name, and
>>>>>>> what is their citizenship? I would need this information for my own tax
>>>>>>> purposes.
>>>>>>>
>>>>>>> Sincerely,
>>>>>>> Matt
>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Alexei
>>>>>>>>
>>>>>>>> 2016-11-24 19:14 GMT+03:00 Aliaksei Rudak <alru...@gmail.com>:
>>>>>>>> Yes, OK.
>>>>>>>>
>>>>>>>> Skype chat at 9 AM EST on Friday, November 25.
>>>>>>>>
>>>>>>>> 2016-11-24 18:33 GMT+03:00 Matt Post <p...@cs.jhu.edu>:
>>>>>>>> Great, let's chat at 9 AM EST?
>>>>>>>>
>>>>>>>>> On Nov 23, 2016, at 4:31 PM, Aliaksei Rudak <alru...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> I sent you a Skype request. Let's plan on Friday, when you have free
>>>>>>>>> time. I am available anytime from 8:00 to 16:00 (EST). I will
>>>>>>>>> deploy everything from my local machine to some service (I will
>>>>>>>>> select it tomorrow) and send you access.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Alexei
>>>>>>>>>
>>>>>>>>> 2016-11-24 0:08 GMT+03:00 Matt Post <p...@cs.jhu.edu>:
>>>>>>>>> I am mpost89. What time works for you? I have a little time now,
>>>>>>>>> otherwise not till Friday. I am in the EST time zone.
>>>>>>>>>
>>>>>>>>> matt
>>>>>>>>>
>>>>>>>>>> On Nov 23, 2016, at 3:28 PM, Aliaksei Rudak <alru...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Matt,
>>>>>>>>>>
>>>>>>>>>> My Skype is "alrudak". Can we talk by voice and discuss all the
>>>>>>>>>> details?
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Alexei
>>>>>>>>>>
>>>>>>>>>> 2016-11-23 22:41 GMT+03:00 Matt Post <p...@cs.jhu.edu>:
>>>>>>>>>> That should be fairly easy to do. What about running them as Amazon
>>>>>>>>>> AMI instances? Or do you want them to run on your own servers? Would
>>>>>>>>>> docker containers suffice?
>>>>>>>>>>
>>>>>>>>>> This might be something I could do for you. Can you give me more
>>>>>>>>>> information about your company?
>>>>>>>>>>
>>>>>>>>>> matt
>>>>>>>>>>
>>>>>>>>>>> On Nov 23, 2016, at 9:48 AM, Aliaksei Rudak <alru...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> I'm trying to create my own translation service like Google
>>>>>>>>>>> Translate, with an API to use from my mobile apps (as clients) or
>>>>>>>>>>> as a web site where you can enter a phrase for translation (like
>>>>>>>>>>> Google did).
>>>>>>>>>>> You said that a "Google-translate-style API" is already
>>>>>>>>>>> available in server mode. How can I use it?
>>>>>>>>>>>
>>>>>>>>>>> I was able to install the server and download one language pair to
>>>>>>>>>>> test, for example English-German. Can this language pair
>>>>>>>>>>> translate only in one direction (English to German) and not the
>>>>>>>>>>> other way (from German to English)? If it's possible to translate
>>>>>>>>>>> in the reverse direction, how can I do this?
>>>>>>>>>>>
>>>>>>>>>>> If someone can help me on a paid basis, please give me their
>>>>>>>>>>> contact details.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Alexei
>>>>>>>>>>>
>>>>>>>>>>> 2016-11-23 16:21 GMT+03:00 Matt Post <p...@cs.jhu.edu>:
>>>>>>>>>>> 1. Yes, you can translate as much as you'd like. Do you mean lots
>>>>>>>>>>> of sentences or long sentences?
>>>>>>>>>>>
>>>>>>>>>>> 2. Yes, that is what it does. It even offers (in server mode) a
>>>>>>>>>>> Google-translate-style API.
>>>>>>>>>>>
>>>>>>>>>>> 3. There may be someone interested in helping you. What exactly are
>>>>>>>>>>> you trying to do? What do you mean "all" language pairs?
>>>>>>>>>>>
>>>>>>>>>>> > On Nov 19, 2016, at 2:08 PM, Aliaksei Rudak <alru...@gmail.com> wrote:
>>>>>>>>>>> >
>>>>>>>>>>> > Hi Matt,
>>>>>>>>>>> > Can you help me and answer several questions about the Joshua
>>>>>>>>>>> > project?
>>>>>>>>>>> >
>>>>>>>>>>> > 1) Is it possible to translate large amounts of text with Joshua
>>>>>>>>>>> > (for example, 1000 characters per transaction)?
>>>>>>>>>>> > 2) Does Joshua work like Google Translate, so that you can put in
>>>>>>>>>>> > a sentence in one language and get it translated into another?
>>>>>>>>>>> > 3) Can you (or your teammates) help me deploy Joshua on my
>>>>>>>>>>> > server and set up all language pairs? I will pay you.
>>>>>>>>>>> >
>>>>>>>>>>> > Regards,
>>>>>>>>>>> > Alexei
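
For reference, the pivoting suggestion from earlier in this thread (German to English, then English to French, when no German-French parallel data is available) could be sketched as two language packs chained together. This is only a rough sketch under assumptions: the pack directory names in the usage line are placeholders, each pack is assumed to contain the same "prepare.sh" and "joshua" scripts used in the command at the top of this message, and both are assumed to read stdin and write stdout. Whether the English output of the first pack can be fed to the second pack without extra cleanup has not been checked here.

```shell
#!/bin/sh
# Translate stdin with one language pack: tokenize, then decode.
# $1 is the pack directory (assumed to hold prepare.sh and joshua).
translate() {
    (cd "$1" && ./prepare.sh | ./joshua)
}

# Pivot through an intermediate language: the output of the first pack
# becomes the input of the second pack.
# $1 and $2 are pack directories (hypothetical names in the usage below).
pivot() {
    translate "$1" | translate "$2"
}

# Usage (directory names are placeholders, not real download names):
#   echo "Guten Morgen" | pivot ./language-pack-de-en ./language-pack-en-fr
```

Each direction still needs its own trained model, so this runs two separate packs and uses roughly the memory of both.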