On Mon, 22 Jul 2013 at 22:21 +0400, Ilnar Salimzyan wrote:
>
>
> 2013/7/19 Francis Tyers <[email protected]>
> Hello all,
>
> Dávid Nemeskey, one of our GSOC students, has written an ATT format to
> lttoolbox binary format compiler.[1] I'd like to include this into
> lttoolbox as an option to lt-comp.
>
> The compiler allows us to convert transducers created with HFST into
> lttoolbox compatible ones. This has two benefits:
>
> 1) It means that we can get over the LRLM tokenisation bug in hfst-proc
>
> 2) It means that we could use HFST format transducers in
> lttoolbox-java.
>
> If there are any requests or suggestions, please let me know. I'll
> probably do it this weekend.
>
> Finally, I'd like to applaud Dávid for his excellent work! I think this
> is going to turn out to be very useful!
>
> +1 here, thanks for taking care of this issue.
>
> Ilnar
Ok, so most of the code is merged; the one thing left is the
integration with lt-comp. I'm attaching a diff of my proposal so that
people can look it over, and if there are no objections or suggestions,
I'll apply it this week.
=======================================================================
$ cat /tmp/transducer.att
0 1 a b
1 2 @0@ <n>
2
$ lt-comp lr /tmp/transducer.att /tmp/transducer.bin
main@standard 3 2
final@inconditional 1 0
$ echo "a a aa" | lt-proc /tmp/transducer.bin
^a/b<n>$ ^a/b<n>$ ^aa/*aa$
$ cat /tmp/empty.xml
<x></x>
$ ./lt-comp lr /tmp/empty.xml /tmp/empty.bin
Error (2): Invalid node '<x>'.
=======================================================================
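For anyone who has not seen the AT&T text format used in the example above, here is a rough sketch of how such a listing can be read. This is not the code in att_compiler.h: the names AttTransition, AttGraph and parseAtt are made up for illustration, and the sketch assumes whitespace-separated fields, that a line with only a state number (plus an optional weight) marks a final state, and that @0@ stands for epsilon.

```cpp
#include <istream>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical types for illustration; not the lttoolbox AttCompiler API.
struct AttTransition {
  int from;
  int to;
  std::string upper;  // input symbol, "" for epsilon
  std::string lower;  // output symbol, "" for epsilon
};

struct AttGraph {
  std::vector<AttTransition> transitions;
  std::set<int> finals;
};

// Map the AT&T epsilon marker "@0@" to the empty string.
static std::string normaliseSymbol(const std::string &s) {
  return s == "@0@" ? std::string() : s;
}

// Read an AT&T-style listing. Lines with 4 or 5 fields are taken as
// transitions (from, to, upper, lower, optional weight); lines with
// 1 or 2 fields as final states (state, optional weight). Any other
// line is skipped. Weights are ignored in this sketch, and std::stoi
// will throw if a state field is not a number.
AttGraph parseAtt(std::istream &in) {
  AttGraph g;
  std::string line;
  while (std::getline(in, line)) {
    std::istringstream fields(line);
    std::vector<std::string> f;
    std::string tok;
    while (fields >> tok) {
      f.push_back(tok);
    }
    if (f.size() == 4 || f.size() == 5) {
      g.transitions.push_back({std::stoi(f[0]), std::stoi(f[1]),
                               normaliseSymbol(f[2]), normaliseSymbol(f[3])});
    } else if (f.size() == 1 || f.size() == 2) {
      g.finals.insert(std::stoi(f[0]));
    }
  }
  return g;
}
```

Fed the three lines of /tmp/transducer.att, this yields two transitions (a:b from state 0 to 1, and epsilon:<n> from state 1 to 2) with state 2 as the only final state.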
Fran
Index: lt_comp.cc
===================================================================
--- lt_comp.cc (revision 45898)
+++ lt_comp.cc (working copy)
@@ -17,6 +17,7 @@
* 02111-1307, USA.
*/
#include <lttoolbox/compiler.h>
+#include <lttoolbox/att_compiler.h>
#include <lttoolbox/lttoolbox_config.h>
#include <cstdlib>
@@ -27,6 +28,16 @@
using namespace std;
+/*
+ * Error function that does nothing so that when we fallback from
+ * XML to AT&T, the user doesn't get a message unless it's really
+ * invalid XML.
+ */
+void errorFunc(void *ctx, const char *msg, ...)
+{
+ return;
+}
+
void endProgram(char *name)
{
if(name != NULL)
@@ -47,7 +58,9 @@
int main(int argc, char *argv[])
{
+ char ttype = 'x';
Compiler c;
+ AttCompiler a;
c.setVerbose(false);
#if HAVE_GETOPT_LONG
@@ -133,6 +146,27 @@
break;
}
+ xmlTextReaderPtr reader;
+ reader = xmlReaderForFile(infile.c_str(), NULL, 0);
+ xmlGenericErrorFunc handler = (xmlGenericErrorFunc)errorFunc;
+ initGenericErrorDefaultFunc(&handler);
+ if(reader != NULL)
+ {
+ int ret = xmlTextReaderRead(reader);
+ if(ret != 1)
+ {
+ ttype = 'a';
+ }
+ xmlFreeTextReader(reader);
+ xmlCleanupParser();
+ }
+ else
+ {
+ ttype = 'a';
+ }
+ initGenericErrorDefaultFunc(NULL);
+
+
if(opc == "lr")
{
if(vr == "" && vl != "")
@@ -144,7 +178,14 @@
{
c.parseACX(acxfile, Compiler::COMPILER_RESTRICTION_LR_VAL);
}
- c.parse(infile, Compiler::COMPILER_RESTRICTION_LR_VAL);
+ if(ttype == 'a')
+ {
+ a.parse(infile, Compiler::COMPILER_RESTRICTION_LR_VAL);
+ }
+ else
+ {
+ c.parse(infile, Compiler::COMPILER_RESTRICTION_LR_VAL);
+ }
}
else if(opc == "rl")
{
@@ -153,7 +194,15 @@
cout << "Error: -r specified, but mode is rl" << endl;
endProgram(argv[0]);
}
- c.parse(infile, Compiler::COMPILER_RESTRICTION_RL_VAL);
+ if(ttype == 'a')
+ {
+ a.parse(infile, Compiler::COMPILER_RESTRICTION_RL_VAL);
+ }
+ else
+ {
+ c.parse(infile, Compiler::COMPILER_RESTRICTION_RL_VAL);
+ }
+
}
else
{
@@ -166,6 +215,13 @@
cerr << "Error: Cannot open file '" << outfile << "'." << endl;
exit(EXIT_FAILURE);
}
- c.write(output);
+ if(ttype == 'a')
+ {
+ a.write(output);
+ }
+ else
+ {
+ c.write(output);
+ }
fclose(output);
}
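
To make the control flow of the patch easier to follow without libxml2 at hand, here is a stand-alone sketch of the "try XML, otherwise fall back to AT&T" decision. Note that it swaps the technique: the patch probes the file with xmlTextReaderRead, whereas this sketch only looks at the first non-whitespace character, which is a much cruder test. InputFormat and sniffFormat are hypothetical names, not part of lttoolbox.

```cpp
#include <cctype>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical names for illustration; the patch itself keeps a char
// flag ('x' or 'a') instead of an enum.
enum class InputFormat { Xml, Att };

// Crude stand-in for the libxml2 probe in the patch: treat the input
// as XML if its first non-whitespace character is '<', otherwise as
// AT&T. This is an assumption-laden heuristic, not an XML check.
InputFormat sniffFormat(std::istream &in) {
  char ch;
  while (in.get(ch)) {
    if (std::isspace(static_cast<unsigned char>(ch))) {
      continue;
    }
    return ch == '<' ? InputFormat::Xml : InputFormat::Att;
  }
  // Empty or all-whitespace input: fall back to AT&T, which is also
  // what the patch does when the XML read does not succeed.
  return InputFormat::Att;
}
```

With this heuristic the two files from the example are routed the same way as in the transcript: transducer.att starts with a digit and goes to the AT&T compiler, while empty.xml starts with '<' and goes to the XML compiler, which then reports the invalid node.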
_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff