2013/5/30 Giulio Paci <giuliop...@gmail.com>

> From an acopost perspective, I think the structure that is now in place
> is fine. With respect to what I usually do when I can decide what to do,
> the only difference is that scripts are now in /bin instead of
> /src/scripts.
>

Moving them to /src/scripts would make sense, IMO. I was not comfortable
with /bin holding just scripts, which is why I had thought of moving the
compiled programs into it. But removing /bin entirely seems more
appropriate given that, as per below, the compiled program will stay in
/src.


> I think compiled programs should stay where autoconf/automake decides to
> put them by default: I do not see any benefit in changing that default.
>

Alas, my inexperience: I was under the impression that the default was to
put them in /bin, and that putting them in /src was the change. As always,
I would go for the standard practices unless we have reason not to or
unless it is not clear what the standard would be.


>
> > /data -- default files (tbt and et rules)
> > /data/{language} -- pre-trained models, such as Marco Baroni's Italian
> > and Ulrik's pre-1948 Danish; as a complete set of trained models is not
> > complex, everything should ideally be kept in a single directory
>
> I like the name /examples, because it makes it clear they are just
> examples. Unless we are planning to release data as well. In this case I
> think we should setup a sub-module for data and separate clearly data
> from software.
>

I think that what I was suggesting to put in /data (such as the model tbt
and et rules) should be distributed in the main package and installed in
/usr/share/acopost. Even reading the source, it is far from obvious what
such "configuration" files are expected to look like, and as our end users
will probably be training their own models, it is more practical to
distribute these files along with the binaries than to point users, in the
documentation, to an additional package to download.

I had suggested putting everything in /data just to reduce the number of
directories and to be consistent (/bin with binary files, /src with source
files, /data with user-experience files, etc.)

But sub-modules for the actual language models, such as the available
Italian and Danish ones, are a great idea; they could even include a
wrapper to the voting tagger which would automatically load the correct
model files (so that a user could do, say,
`sudo apt-get install acopost-english && echo 'This is just an example.' | acopost-english`).


> > /maintain -- development&debug scripts, C testing, etc.
> > /maintain/data -- data for testing, fake language corpus and eventually
> > its pre-trained models, etc.
>
> In my opinion we should put in /maintain what is interesting only for
> package maintainers, the main purpose of this directory should be to
> avoid source tree pollution with scripts and stuff that are useful only
> to a few of us.
>

Maybe a /maintain for package maintainers and a /devel for those who want
to compile, study and extend acopost, but who don't care about the
packaging? (but see answer below)


> /tests serve a completely different purpose: it contains tests that are
> useful to check acopost integrity on the end-user machine: the data
> should be distributed and should be used as a standard step of the
> installation procedures.
>

Which pretty much sums up my idea of the /devel directory. So, are we
decided on a /maintain and a /tests?


> > /src -- should only keep the source of the distributed programs;
> > acopost_test should be moved to /maintain
>
> Probably true. I have not yet understood what is the purpose of
> acopost_test, but I think it is probable that it would be useful to make
> it part of the test suite as well, so /test/src or even /src would be a
> good place for it. We only need to make sure that "make install" does
> not install it.
>

First of all, I had forgotten lextest.c, which is similar to acopost_test.c
and would follow it.

The idea of acopost_test is to test/stress (and, in an ugly way, document)
the "library" of acopost (mem.c, array.c, hash.c...) before I actually test
the taggers. I know from experience, for example, that `met` was always
dumping core when the number of tags was >= 40 or 50, which looked more
like a matter of memory leaking than of memory shortage. I would also like
to test, for example, whether there are better hashing functions than
Knuth's, given our characteristics (strings are usually very short, tags
are usually very alike, etc.), whether we can handle the collisions in
hash.c better, etc. In short, it is supposed to be both a workbench and, in
the long run, a unit test suite. It is what I usually do, but, once more, I
am certainly not your regular C guru... ;)

But I now agree with Ulrik about keeping it where it is: moving
acopost_test.c would also mean lots of ugly "../src" in the #includes.
Let's keep it the way it is, making sure that it is not installed.


>
> > BTW, I am not completely sure about the name `maintain`, perhaps we
> > could go for `devel`.
>
> /devel is a good name for it. But unless we ALL agree to move
> maintenance scripts there, I do not think it is a priority to have this
> directory as we only have 3 scripts that are candidates to move there
> and, as Ulrik reported, autogen.sh usually resides in the top directory
> of the source tree.
>

I agree, autogen.sh should reside in the top directory. Extending what I
said above, it would be a matter of having one directory named /maintain and
a second one named either /devel or /tests, if you agree on separating the
"packaging" from the "hacking around". For the time being, however, I would
keep it the way it is: acopost_test and company in /src, the scripts in the
top directory.

Bests,

Tiago