Re: advice needed on Lingua::Identification

2004-03-09 Thread jac
 I'd much prefer ::Identify.

I'm starting to agree with your idea :-)

jac

-- 
Jose' Alves de Castro


Re: advice needed on Lingua::Identification

2004-03-09 Thread jac
First of all, thanks for your opinion :-)

 If it's simple to move the data for each language into a config file and use
 identical logic for all languages, that's usually the way to go.

The logic (at least initially) will be the same for each and every language.

 There's also a combination of the two. For example, perhaps it makes sense to 
 combine Latin-based languages together.

I did implement the notion of language sets (the user is able to enable and disable 
languages to search within, but also to enable and disable sets of languages). The 
problem arises when a language is contained in more than one set; English, for one, 
would be in the 'European' set, in the 'European Community' set, and possibly in the 
'Most Common' set ;-)
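
One way to sidestep the overlap problem is to treat a language as active whenever at least one enabled set contains it, and to let an explicit per-language disable win over set membership. The following is only a sketch under that assumption; the set contents, the `%SETS` table and `active_languages` are hypothetical names, not part of the module.

```perl
use strict;
use warnings;

# Hypothetical set definitions; a language may appear in several sets.
my %SETS = (
    'European'           => [qw(en pt fr de)],
    'European Community' => [qw(en pt fr de)],
    'Most Common'        => [qw(en zh es)],
);

# Return the sorted list of active languages: the union of all enabled
# sets, plus individually enabled languages, minus individually
# disabled ones (an explicit disable always wins).
sub active_languages {
    my (%arg) = @_;
    my %active;
    for my $set (@{ $arg{sets} || [] }) {
        $active{$_} = 1 for @{ $SETS{$set} || [] };
    }
    $active{$_} = 1 for @{ $arg{enable} || [] };
    delete $active{$_} for @{ $arg{disable} || [] };
    return sort keys %active;
}

my @langs = active_languages(
    sets    => ['European', 'Most Common'],
    disable => ['de'],
);
print "@langs\n";    # en es fr pt zh
```

With this policy, disabling the 'European' set alone would not remove English as long as 'Most Common' stays enabled, which may or may not be the behaviour users expect.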

If I were to store all the language information in a single (user-editable) file, 
where should I place it? :-| (considering I would want any user to be able to use it 
without needing a local copy, and that any user should be able to include his own 
languages) ... That would take two files, right? One for everyone and another one in 
each user's home directory...

Does the notion of a 'home directory' pose a problem on other OSes? :-| I 
have only used Perl on Unix systems.
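
For what it's worth, the two-file idea can be sketched portably with the core File::Spec module. This is a rough sketch, not a recommendation: the file names `languages.conf` and `.lingua-identification` are made up, and checking `HOME` and then `USERPROFILE` is only a best-effort guess at the home directory (the CPAN module File::HomeDir is one alternative worth looking at).

```perl
use strict;
use warnings;
use File::Spec;

# Best-effort guess at the user's home directory: HOME on Unix-like
# systems, USERPROFILE on Windows. Returns undef if neither is set.
sub home_dir {
    for my $var (qw(HOME USERPROFILE)) {
        return $ENV{$var} if defined $ENV{$var} && length $ENV{$var};
    }
    return;
}

# Candidate config files, lowest priority first: the shared file,
# then the per-user file. Both file names are hypothetical.
sub config_candidates {
    my ($system_dir) = @_;
    my @files = (File::Spec->catfile($system_dir, 'languages.conf'));
    my $home = home_dir();
    push @files, File::Spec->catfile($home, '.lingua-identification')
        if defined $home;
    return @files;
}

my $shared_dir = File::Spec->catdir(File::Spec->rootdir, 'etc');
print "$_\n" for config_candidates($shared_dir);
```

Reading the files in this order and letting later entries replace earlier ones would give each user a way to add languages without touching the shared copy.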

jac


Re: advice needed on Lingua::Identification

2004-03-02 Thread A. Pagaltzis
* [EMAIL PROTECTED] [EMAIL PROTECTED] [2004-02-26 17:29]:
 I'm putting together some things I have and creating a module
 named Lingua::Identification.

Just a comment on the name: personally, I'd much prefer
::Identify. It's half as long, says exactly the same thing, and
it feels less awkward to have the second word be a verb operating
on the first.

-- 
Regards,
Aristotle
 
If you can't laugh at yourself, you don't take life seriously enough.


Re: advice needed on Lingua::Identification

2004-02-27 Thread khemir nadim
Hi,
[EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]
 One possibility would be to force each known language to become a module
 (Lingua::Identification::EN for English, etc)... the downside of this
 solution is that once I have 50 languages, I'll have 51 modules... :-|

1 or 100, it doesn't really matter as long as everything is installed in one
swoop.
Now you should test the load time. If having 50 modules adds 3 seconds of load
time, no one is going to use your module (IMHO).
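
A quick way to check this is to time the `require` calls with the core Time::HiRes module. The sketch below is only illustrative: `time_loading` is a hypothetical helper, and the three core modules are stand-ins for the real language modules, which do not exist yet.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Time how long it takes to load a list of modules with require.
# Modules that are already loaded cost almost nothing, so this is
# best run in a fresh perl process.
sub time_loading {
    my (@modules) = @_;
    my $start = [gettimeofday];
    for my $module (@modules) {
        (my $file = "$module.pm") =~ s{::}{/}g;   # Foo::Bar -> Foo/Bar.pm
        require $file;
    }
    return tv_interval($start);                   # elapsed seconds
}

# Stand-ins from the core distribution; replace with the language modules.
my $seconds = time_loading(qw(File::Spec List::Util Data::Dumper));
printf "loaded in %.4f s\n", $seconds;
```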

 Another possibility is to have everything in a single file, and allow
 the user to set up a configuration file himself, which may contain other
 languages...

I, as a user, wouldn't care which of the above methods you, the author, use
if I can:

1/ override yours
2/ define mine
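
Both requirements can be met with a plain hash merge, whichever storage method is chosen. The following is a minimal sketch under that assumption; `merge_languages`, the language codes and all the numbers are placeholders, not real data.

```perl
use strict;
use warnings;

# Built-in data shipped with the module (values are made-up placeholders).
my %builtin = (
    en => { the => 0.06, and => 0.03 },
    pt => { de  => 0.05, que => 0.03 },
);

# Merge user-supplied languages over the built-in ones: a user entry
# with an existing code replaces the shipped data for that language,
# and an entry with a new code adds a language.
sub merge_languages {
    my ($base, $user) = @_;
    return { %$base, %$user };
}

my %user = (
    en => { the => 0.07 },    # 1/ overrides the shipped English data
    gl => { que => 0.04 },    # 2/ defines a new language
);

my $languages = merge_languages(\%builtin, \%user);
print join(', ', sort keys %$languages), "\n";    # en, gl, pt
```

Note that this replaces a language's data wholesale; merging individual entries within a language would need a deeper merge.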

Hope this helps.

Cheers, Nadim.




advice needed on Lingua::Identification

2004-02-26 Thread jac
Hi, everybody.

I'm in need of advice here...

I'm putting together some things I have and creating a module named
Lingua::Identification. I won't go into detail on why this module
should be created or anything else (unless someone asks me to) since
that has already been done (though not here, I know).

My problem is: the module has to retain some information about each
known language; this could easily be done by having it persist in the
module itself... however, I also want the user to be able to *teach* a
new language to the module... how should I keep that information?

One possibility would be to force each known language to become a module
(Lingua::Identification::EN for English, etc)... the downside of this
solution is that once I have 50 languages, I'll have 51 modules... :-|
Still, CPAN can take care of things for us and install them all without
problem... but still, I'm not convinced... It is true that this would
allow the user to install only the desired languages and also ease the
learning process for new ones... besides, the module wouldn't have to
read unnecessary information on startup (there is the possibility of
identifying between only two languages, for example, so you don't need
to prepare all of them).
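
The on-demand loading described above could look roughly like the sketch below. It is only an illustration: the two language packages are defined inline so the example runs on its own, and the `profile` method, `load_language` and all the numbers are hypothetical.

```perl
use strict;
use warnings;

# Two hypothetical language packages, defined inline so the sketch is
# self-contained; in a real distribution each would live in its own file.
package Lingua::Identification::EN;
sub profile { return +{ the => 0.06, and => 0.03 } }

package Lingua::Identification::PT;
sub profile { return +{ que => 0.03, de => 0.05 } }

package main;

my %loaded;    # cache of language data already prepared

# Load the data for one language code only when it is first requested.
# Returns undef if no module for that language can be found.
sub load_language {
    my ($code) = @_;
    return $loaded{$code} if exists $loaded{$code};
    my $module = 'Lingua::Identification::' . uc $code;
    unless ($module->can('profile')) {
        (my $file = "$module.pm") =~ s{::}{/}g;
        return unless eval { require $file; 1 };
    }
    return $loaded{$code} = $module->profile;
}

# Identifying between only two languages touches only those two modules.
for my $code (qw(en pt)) {
    my $profile = load_language($code) or next;
    printf "%s: %d entries\n", $code, scalar keys %$profile;
}
```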

Another possibility is to have everything in a single file, and allow
the user to set up a configuration file himself, which may contain other
languages...

I don't know what's best... can you help me with this? Can you tell me
your opinion?

I've done quite a lot of programming in Perl, but I've never uploaded a
decent module to CPAN... I'm trying to do that for the first time, and I
don't want to screw things up right from the beginning...

Thanks to all of you.

Best regards,

jac


Re: advice needed on Lingua::Identification

2004-02-26 Thread Austin Schutz
On Thu, Feb 26, 2004 at 04:27:22PM +, [EMAIL PROTECTED] wrote:
 One possibility would be to force each known language to become a module
 (Lingua::Identification::EN for English, etc)... the downside of this
 solution is that once I have 50 languages, I'll have 51 modules... :-|
 Still, CPAN can take care of things for us and install them all without
 problem... but still, I'm not convinced... It is true that this would
 allow the user to install only the desired languages and also ease the
 learning process for new ones... besides, the module wouldn't have to
 read unnecessary information on startup (there is the possibility of
 identifying between only two languages, for example, so you don't need
 to prepare all of them).
 
 Another possibility is to have everything in a single file, and allow
 the user to set up a configuration file himself, which may contain other
 languages...
 
 I don't know what's best... can you help me with this? Can you tell me
 your opinion?
 

The answer (in my experience) is: it depends. If it's simple to
move the data for each language into a config file and use identical logic
for all languages, that's usually the way to go.
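
As a rough illustration of "data in a config, identical logic for all languages", the sketch below keeps one word list per language and runs the same scoring routine over each. The word lists, the `identify` function and the scoring rule (count of matching words) are all assumptions made for the example; the module's actual algorithm is not described in this thread.

```perl
use strict;
use warnings;

# Per-language data only; this is the part that could live in a config
# file. The word lists are made-up placeholders.
my %PROFILES = (
    en => [qw(the and of to)],
    pt => [qw(o que de e)],
    fr => [qw(le la de et)],
);

# One scoring routine shared by every language: count how many words of
# the text appear in the language's word list and return the best match
# (undef if nothing matches at all).
sub identify {
    my ($text, $profiles) = @_;
    my @words = map { lc $_ } $text =~ /(\w+)/g;
    my ($best, $best_score) = (undef, 0);
    for my $lang (sort keys %$profiles) {
        my %known = map { $_ => 1 } @{ $profiles->{$lang} };
        my $score = grep { $known{$_} } @words;
        ($best, $best_score) = ($lang, $score) if $score > $best_score;
    }
    return $best;
}

print identify('The cat and the dog went to the park', \%PROFILES), "\n";    # en
```

Adding a language then means adding one entry to the data, with no change to the code.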

If the logic for each language is different, having separate
modules is generally easier to maintain, because changing the logic for one
language in an algorithm that supports many languages may break the
functionality for the others. A single shared algorithm may often seem
easier at first, but it doesn't scale too well.

There's also a combination of the two. For example, perhaps it makes
sense to combine Latin-based languages together. Then perhaps you would
have:

Lingua::Identification
Lingua::Identification::Latin

and, if necessary, have e.g. Lingua::Identification::EN inherit from
Latin, if you end up needing to handle English differently from most
Latin-based languages, which in turn are processed differently from
other languages.
The other thing that's handy about this approach is that you can
have common logic in the superclass and only override parts that differ.
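
A minimal sketch of that layout, with the packages defined inline so it runs as one file. The `tokenize` and `count_tokens` methods and the apostrophe rule for English are invented for illustration only.

```perl
use strict;
use warnings;

package Lingua::Identification::Latin;

sub new { my ($class) = @_; return bless {}, $class }

# Common logic: split on non-word characters and lowercase.
sub tokenize {
    my ($self, $text) = @_;
    return map { lc $_ } grep { length } split /\W+/, $text;
}

# Shared entry point; it calls whichever tokenize the subclass provides.
sub count_tokens {
    my ($self, $text) = @_;
    my @tokens = $self->tokenize($text);
    return scalar @tokens;
}

package Lingua::Identification::EN;
our @ISA = ('Lingua::Identification::Latin');

# Override only the part that differs: keep apostrophes so that
# "don't" stays a single token (a hypothetical English-only rule).
sub tokenize {
    my ($self, $text) = @_;
    return map { lc $_ } $text =~ /([\w']+)/g;
}

package main;

my $text = "Don't panic";
print Lingua::Identification::Latin->new->count_tokens($text), "\n";    # 3
print Lingua::Identification::EN->new->count_tokens($text), "\n";       # 2
```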

Hope that makes some sense. :-)

Austin