On Tue, 5 Jun 2007, Sebastian Hagen wrote:
...
Further optimization wouldn't be easy. While eating this amount of memory
isn't nice, I expect that it shouldn't pose a problem on the typical build
system, and it doesn't need to run for very long in any event.

Thanks for your constant effort in bringing back dict-wn.

I updated my unofficial repository at

      http://people.debian.org/~tille/packages/wordnet/

with your latest code and hardcoded the arguments for simplicity.
That way I got:

...

LC_ALL=C \
        python wordnet_structures.py \
                 ../../dict/dbfiles/index.adv  ../../dict/dbfiles/data.adv \
                 ../../dict/dbfiles/index.adj  ../../dict/dbfiles/data.adj \
                 ../../dict/dbfiles/index.noun ../../dict/dbfiles/data.noun \
                 ../../dict/dbfiles/index.verb ../../dict/dbfiles/data.verb
Opening index file '../../dict/dbfiles/index.adv'...
Opening data file '../../dict/dbfiles/data.adv'...
Parsing index file and data file...
Opening index file '../../dict/dbfiles/index.adj'...
Opening data file '../../dict/dbfiles/data.adj'...
Parsing index file and data file...
Opening index file '../../dict/dbfiles/index.noun'...
Opening data file '../../dict/dbfiles/data.noun'...
Parsing index file and data file...
Opening index file '../../dict/dbfiles/index.verb'...
Opening data file '../../dict/dbfiles/data.verb'...
Parsing index file and data file...
All input files parsed. Writing output to index file 'wn.index' and data file 'wn.dict'.
All done.
LINESWRONG=`grep "^.\{73,\}" wn.dict | wc -l` ; \
            if [ ${LINESWRONG} -gt 0 ] ; then \
                echo "${LINESWRONG} lines to long in wn.dict.  wordnet_structures.py should be fixed." ; \
                exit -1 ; \
            fi
2 lines to long in wn.dict.  wordnet_structures.py should be fixed.


I have no idea how strict we should be about the length of lines
in the output.  I just adopted this check from the previous maintainer.
Moreover, the problem is only in the leading comment:

$ grep "^.\{73,\}" wn.dict
the following copyright notice and statements, including the disclaimer,
WordNet 2.1 Copyright 2005 by Princeton University. All rights reserved.

that is contained in all input files.  What is your opinion about this?
I think I will change the check to

        if [ ${LINESWRONG} -gt 2 ] ; then

which spares us this noise; if wordnet_structures.py really
produced several too-long lines, we would still notice.
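
For illustration, the relaxed check could also be written in Python. This
is only a sketch, not part of the package: the names MAX_WIDTH,
ALLOWED_LONG, find_long_lines and check are made up here, and the
tolerance of 2 is the assumption from above that only the two copyright
header lines are too long. Note that it counts characters, whereas grep
under LC_ALL=C counts bytes.

```python
MAX_WIDTH = 72     # the grep pattern "^.\{73,\}" flags lines of 73+ characters
ALLOWED_LONG = 2   # assumed tolerance: the two copyright header lines


def find_long_lines(lines, max_width=MAX_WIDTH):
    """Return (line_number, text) pairs for lines wider than max_width."""
    result = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")  # do not count the newline itself
        if len(text) > max_width:
            result.append((number, text))
    return result


def check(lines, allowed=ALLOWED_LONG, max_width=MAX_WIDTH):
    """Return True when at most `allowed` lines exceed max_width."""
    return len(find_long_lines(lines, max_width)) <= allowed
```

Since find_long_lines accepts any iterable of lines, it can be given an
open file object for wn.dict directly; the returned line numbers show
where the offending lines are, which the plain count from wc -l does not.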

Kind regards

         Andreas.

--
http://fam-tille.de

