On Saturday 31 March 2007 22:17, Marcin Miłkowski wrote:
> Hi Graham,
>
> > Warned?  about what?  If the Wordnet list is Opensource then what is the
> > issue?
>
> I understood you want to start from _scratch_. 

I never said that.  In fact I said exactly the opposite.  I'm not sure how you 
could have drawn that inference; I apologise if I am not being clear enough.

> Wordnet was sponsored by
> a 20-million USD grant, and done by a team of really qualified
> linguists. And it is one of the biggest achievements in computer
> linguistics as such. So you know that trying to beat that requires a lot
> of resources, and IT resources are really not so important.

But as I keep saying, and as many others keep saying and repeating, the OOo 
thesaurus that this is based on is seriously substandard, especially in 
en_GB/NZ/AU/ZA.

>
> If you want to extend Wordnet, then it's another story. Of course, it's
> easier to do so.
>
> >> I'd recommend searching for local English Wordnets (or similar
> >> linguistic projects), maybe there are Australian versions.
> >
> > I've already done that, those that I've seen are small operations run by
> > a single enthusiast on a small backroom server that has a single point of
> > failure.
>
> The server is a minor issue. The major issue is how to start a team -
> single enthusiasts could never achieve that with no remuneration.

Excuse me if I'm beginning to sound frustrated, but as I already explained, we 
have a team of people, paid people, who are willing to administer the setup.  
They have project managers who will gather the linguists.  What you are 
talking about is frankly rather trivial in a corporate space.  Assembling 
project teams is a daily task in this environment.  You are seeing barriers 
where there are none.

>
> >> Trying to
> >> build a new thesaurus from scratch is simply futile.
> >
> > It's a good thing Mr Roget didn't think that.
>
> But it's not 19th century anymore. Roget's thesaurus is really worse
> than Wordnet in linguistic terms. 

But he didn't think it was futile to try, and that was the point.  It's about 
attitude and has nothing to do with comparisons of quality.

> And in linguistics, you try to
> bootstrap and reuse the data.

Of course; that is why I said we would build from the en_GB Wordnet if that is 
what it requires.

>
> > If therefore Openthesaurus is a bad option, the assumption I take from
> > what you are saying is; setting up a local Wordnet is the best
> > alternative.
>
> It is not a bad option. I'm using OpenThesaurus myself. But if you want
> to reuse Wordnet, you need to convert it into OpenThesaurus, and this is
> a non-trivial task.

OK, we seem to have gone in a circle.  But the point is that it's doable with 
the right skills.

>
> You cannot set up a local Wordnet without any software as Wordnet is only
> a file. You need an editing environment. You can use some other software
> (there are many software packages for professional linguists - used for
> building national wordnets - but they could be far too complicated for
> an average user).

There you go again: barriers.  Forget the barriers; I would like to see 
solutions.

Problem:  The OOo English thesaurus is demonstrably substandard.

Solution:  I don't know; that's why I'm asking.  We have a substantial 
benefactor who is willing to commit resources, both financial and material, 
to a thesaurus project, and whom I would like to put in touch with people in 
the OOo community who are more knowledgeable in this area than I.


>
> > What is the step from Wordnet Database to installed Thesaurus in OOo?
>
> Conversion of the database. Take a look at scripts at Daniel Naber's site.
>
> But note that this conversion does not allow any direct editing of Wordnet
> or of the OOo thesaurus.

Sorry, I'm not sure what you mean here.  I'll look at Daniel's scripts.

>
> > Where can I find someone who can exchange emails with my client's people
> > to get them under way?
>
> No idea. First you have to find someone who has some natural language
> processing background, is able to map Wordnet's relations to the MySQL
> database in OpenThesaurus, and can decide whether some of the relations
> are to be discarded or ported into the OpenThesaurus software. I wouldn't
> start a project without finding the person who is able to do that - these
> processes are non-trivial.

As I already said, the technical skills are available.  Once the client knows 
what skillset is needed, he will assign the person most suitable for the job.

>
> I would recommend you to contact linguistics (NLP) departments at
> Australian universities. 

Why Australia?  I'm not Australian and, surprising as it may seem, New Zealand 
does have universities.

> This is a task that would make good postgraduate work.
>
> Regards,
> Marcin
>

-- 
Graham Lauder,

INGOTs Assessor Trainer
Moderator New Zealand
(International Grades in Office Technologies)
www.theingots.org
www.theingots.org.nz

OpenOffice.org MarCon (Marketing Contact) NZ
http://marketing.openoffice.org/contacts.html

www.ooogear.co.nz
