Hi Tiago!

On 08/02/2016 14:02, Tiago Tresoldi wrote:
> for me, it is a "go ahead", too. In fact, I'd go as far as saying that it is
> not that bad if version 2.0 is to some level incompatible with version 1.0:
> it means we can go forward and update an almost 20-year-old code according
> to new practices and new architectures.

I agree with your comments as well.

> A glib-like option handling is very desirable, and the options themselves
> need consistency. I always prefer to have long options too (the ones usually
> preceded by two dashes), but we can work on that later.

I also prefer long options, and I think they would not be very difficult to implement. This was actually my first attempt at writing an option interface that works like the glib one, and I concentrated on implementing the core functionality rather than on supporting every feature that I would like to see in it.

> I have some free days now (we are in the carnival holidays in Brazil), I'll
> check your changes and try to contribute.

This would be great, especially if you are able to spot the bug that I introduced in met. If you can also pick up some of the items in the TODO list (or add new ones), that would be very welcome.

> As we are discussing, some ideas:
>
> - do you think it is a good idea to move from SourceForge? I guess the best
> alternative would be GitHub, keeping the homepage and this mailing list for
> the time being.

As long as git is used, I am OK with any solution. I regularly use GitHub, Bitbucket and SourceForge. Recently I was talking with someone who was interested in hunpos (https://github.com/mivoq/hunpos/) and I told him about acopost. He immediately asked about GitHub. So, if you think it is good to move to GitHub, I am perfectly fine with it. One GitHub feature that I like is the network graph, which shows everyone who forked the repository. One thing I really dislike is pull requests, which are unnecessarily convoluted, but I can live with them (and I am already used to them).
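
Coming back to the option interface for a moment, the following is only a minimal sketch of how long options could sit next to the short ones in a table-driven parser. It is not the code currently in the repository: `option_t`, `parse_options` and the field names are hypothetical, chosen just for illustration. It assumes the behaviour described for the new CLIs (options come before other parameters, short options cannot be collapsed, and `--` explicitly terminates options).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical option descriptor; not the actual acopost structure. */
typedef struct {
    char short_name;        /* e.g. 'v' for -v */
    const char *long_name;  /* e.g. "verbose" for --verbose, or NULL */
    int takes_value;        /* 1 if the next argument is the option value */
    const char **value;     /* receives the value, or "" for a plain flag */
} option_t;

/* Parse leading options of argv.
 * Returns the index of the first non-option argument, or -1 on error
 * (unknown option, collapsed short options such as -ab, missing value). */
static int parse_options(int argc, char **argv, option_t *opts, size_t n_opts)
{
    int i = 1;

    assert(argv != NULL && opts != NULL);
    while (i < argc) {
        const char *arg = argv[i];
        const option_t *match = NULL;
        size_t k;

        if (strcmp(arg, "--") == 0)
            return i + 1;               /* explicit end of options */
        if (arg[0] != '-' || arg[1] == '\0')
            return i;                   /* first ordinary parameter */

        for (k = 0; k < n_opts && match == NULL; k++) {
            if (arg[1] == '-') {        /* long form: --name */
                if (opts[k].long_name != NULL &&
                    strcmp(arg + 2, opts[k].long_name) == 0)
                    match = &opts[k];
            } else if (arg[2] == '\0' && arg[1] == opts[k].short_name) {
                match = &opts[k];       /* short form: exactly one letter */
            }
        }
        if (match == NULL)
            return -1;

        if (match->takes_value) {
            if (i + 1 >= argc)
                return -1;              /* value is missing */
            *match->value = argv[++i];
        } else {
            *match->value = "";         /* mark the flag as present */
        }
        i++;
    }
    return i;
}
```

With a table containing `{ 'v', "verbose", 1, &verbose }` and `{ 'h', "help", 0, &help }`, the arguments `--verbose 0 -h -- -file` would set both options and leave `-file` as the first ordinary parameter, while `-vh` would be rejected.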

> - we have already discussed the "voting system", where multiple taggers are
> run, their results collected and the best one is selected ("voting" is not
> the best description, as there could be neural networks, hard-coded rules,
> etc.); do you think it is best to code it in C or we should explore some
> alternative, like Lua? While I'd like the flexibility of a scripting
> language (for the users, too), I am not very favorable to the idea of
> embedding a full language or having such kind of dependency. I believe that
> being pure-C and no-dependency system is one of the strengths of acopost.

I would prefer to stick to pure C, if it is going to be a usable tagger or a framework. My main goal is to have a library that I can use in a program, and maintaining a C library is much more convenient than maintaining anything else (e.g., ABI compatibility is easy to assess, there is no runtime that must be loaded and can interfere with other libraries, memory usage is predictable, ...).

On the other hand, I think that we should at least consider introducing some dependencies, or carefully plan alternatives. I understand that it is easier to bootstrap a system without any dependency, but I would like to see UTF-8 support properly implemented. That is very difficult to do properly, while libICU (http://site.icu-project.org/) is already there (although it is suboptimal that it uses UTF-16 internally). Another option, probably more appropriate for a library, is to avoid the "lowercase" transformation and implement a normalization mechanism based on callbacks. In this way the burden of dealing with UTF-8 is delegated to the library user and we do not need to add a libICU dependency.

In my typical project I usually end up with several implementations of the Viterbi algorithm, because most projects think that avoiding dependencies is a good feature. Again, I think we could at least consider using something like https://github.com/Sleepwalking/libgvps.
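
To make the callback idea a bit more concrete, here is a rough sketch of what such a normalization hook might look like. Everything in it is an assumption made for illustration: `normalize_fn`, `tagger_config_t`, `tagger_normalize` and the two example callbacks do not exist in acopost. The library would only ever call the hook, and a user who needs full Unicode case folding could plug in a callback backed by libICU or anything else.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical hook: write the normalized form of `word` into `out`
 * (at most out_size bytes, including the terminating NUL).
 * Returns 0 on success, -1 if the result does not fit. */
typedef int (*normalize_fn)(const char *word, char *out, size_t out_size,
                            void *user_data);

/* Hypothetical library configuration carrying the user's hook. */
typedef struct {
    normalize_fn normalize;  /* NULL means "leave words unchanged" */
    void *user_data;         /* passed back to the callback untouched */
} tagger_config_t;

/* Default behaviour: plain copy, no assumption about the encoding. */
static int normalize_identity(const char *word, char *out, size_t out_size,
                              void *user_data)
{
    size_t len = strlen(word);

    (void)user_data;
    if (len + 1 > out_size)
        return -1;
    memcpy(out, word, len + 1);
    return 0;
}

/* Example user callback: lowercase ASCII letters only. Bytes >= 0x80,
 * i.e. the pieces of UTF-8 multi-byte sequences, are copied unchanged. */
static int normalize_ascii_lower(const char *word, char *out, size_t out_size,
                                 void *user_data)
{
    size_t i, len = strlen(word);

    (void)user_data;
    if (len + 1 > out_size)
        return -1;
    for (i = 0; i <= len; i++) {   /* <= len also copies the NUL */
        unsigned char c = (unsigned char)word[i];
        out[i] = (c >= 'A' && c <= 'Z') ? (char)(c - 'A' + 'a') : word[i];
    }
    return 0;
}

/* Library side: the only place where normalization happens. */
static int tagger_normalize(const tagger_config_t *cfg, const char *word,
                            char *out, size_t out_size)
{
    normalize_fn fn;

    assert(cfg != NULL && word != NULL && out != NULL);
    fn = cfg->normalize != NULL ? cfg->normalize : normalize_identity;
    return fn(word, out, out_size, cfg->user_data);
}
```

The caller-provided output buffer keeps memory usage predictable and avoids any allocation inside the library, at the cost of the user having to choose a maximum word length.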

Cheers,
Giulio.

> 2016-02-04 23:23 GMT-02:00 Giulio Paci <giuliop...@gmail.com>:
>
> Hi to all!
>
> On 01/02/2016 21:27, Giulio Paci wrote:
> > On 01/Feb/2016 21:12, "Ulrik Sandborg-Petersen" <ulr...@scripturesys.com> wrote:
> >> Secondly, I think the proposed changes to the options are good, so
> >> from me it is a "go ahead" on the options.
> >
> > Perfect.
> > @Tiago: any opinion?
>
> I finally decided to push my current master branch.
> The work is not complete and there are a few bugs, but as I do not know
> if I will be able to update the code again in the near future, I prefer to
> share it and tell you what should be fixed.
>
> 1) Updating the CLIs of commands, I decided that the same option should
> be associated to almost the same meaning in every command. So I had to
> rename several of them in several commands. However I did not update the
> documentation yet.
>
> Here follows the list of CLI changes, so that it is possible to update
> old command lines:
>
> acopost-et, acopost-t3, acopost-tbt, acopost-met:
>     more strict separation between options and other parameters
>     collapsing multiple options into one is not supported anymore
>     -- can be used to explicitly terminate options
> acopost-et:
>     added -h
>     -t => -o test
>     lexiconfile => -l lexiconfile
> acopost-tbt:
>     added -h
>     -r => -R
>     -n => -r
>     -o accepts tag, test and train in addition to 0, 1 and 2
> acopost-t3:
>     -u => -Z
>     lexiconfile => -l lexiconfile
>     -q => -v 0
>     -t => -o test
>     -d => -o debug
>     -m => -o
>     -l => -L
> acopost-met:
>     -c => -o
>     -s => -C
>     -m => -P
>     -t => -M
>     -p => -K
> acopost-lex2theta:
>     added -h
>     added -r <int>
>     lexiconfile => -l lexiconfile
> acopost-complementary-rate:
>     -q => -v 0
> acopost-evaluate:
>     -v => -v 1 (this is the default now)
>     -i => -C
> acopost-split-corpus:
>     -v => -v 1 (this is the default now)
>     -m => -k
>     -p => -F
> acopost-cooked2fntbl:
>     -v => -v 1 (this is the default now)
> acopost-interchange-matrix:
>     -q => -v 0
> acopost-mean-and-sd:
>     -s => -D
> acopost-cooked2wtree:
>     -e => -X
>     -i => -I
>     -d => -o debug
>     -a => -A
>     -b => -B
>
> 2) I introduced a BUG in met, that breaks tagging functionality (in
> viterbi mode). The BUG has been introduced after
> ce310074f1b194f192cec0cf4822bb8ec7b87e78 (I checked and it is producing
> much more reasonable results) and is probably related to the lowercase
> function replacement;
>
> 3) Using -n option on met gives segmentation fault in my environment. I
> did not yet investigate the cause.
>
> Bests,
> Giulio

_______________________________________________
acopost-devel mailing list
acopost-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/acopost-devel