FW: Google Summer of Code

Farid Bourennani Fri, 04 Apr 2008 08:51:25 -0700

Farid:
On Thursday 03 April 2008, Farid Bourennani wrote:
> > 3) Are any additional tools (such as a GUI) required to be developed to
> > prove my implementation?
>
> By GUI, I meant plotting tools in order to be able to visualize every
> iteration of the implemented machine learning algorithm and validate the
> final results graphically (e.g. Gaussian vs. random data).


Isabel Drost:
There should be some automated means of validating your results that does not
need human intervention. Where possible your algorithms should come with unit
tests to prove that they work.
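
As an illustration of validating results without human intervention, here is a
minimal sketch that replaces a visual "Gaussian vs. random data" check with an
assertion. It is not taken from any project code: the function names, the use
of excess kurtosis as the statistic, and the 0.3 tolerance are all assumptions
chosen for the example.

```python
import random


def excess_kurtosis(samples):
    """Return the sample excess kurtosis (close to 0 for Gaussian data)."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in samples) / n  # fourth central moment
    return m4 / (m2 ** 2) - 3.0


def looks_gaussian(samples, tolerance=0.3):
    """Hypothetical check: accept data whose excess kurtosis is near zero.

    The tolerance is an assumed value that suits a few thousand samples.
    """
    return abs(excess_kurtosis(samples)) < tolerance


def test_gaussian_vs_uniform():
    """Unit-test style check that needs no plotting or manual inspection."""
    rng = random.Random(42)  # fixed seed keeps the test reproducible
    gaussian = [rng.gauss(0.0, 1.0) for _ in range(10000)]
    uniform = [rng.uniform(-1.0, 1.0) for _ in range(10000)]
    assert looks_gaussian(gaussian)
    assert not looks_gaussian(uniform)
```

A test runner such as pytest would pick up `test_gaussian_vs_uniform`
automatically; the same idea carries over to JUnit for Java code.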

Farid:
> 5) It was also mentioned on the project that "Students are also encouraged
> to work on projects related to their own machine learning research". Does
> that mean that all the algorithms used have to be posted right away?

Isabel Drost:
Well, I think you should make available all code and libraries that you use in
a way that is compatible with both the Apache Software License, under which
the code you develop during your project will be licensed, and the licenses of
the libraries you want to use.

That said you need to make available everything that is necessary for your
code to work correctly. It does not make a lot of sense to me to include
some Java module that one can only use if one owns a Matlab license. Or
worse, that only works with a library that is only available to your research
lab. But I guess, that was clear to you already ;)

Farid (NEW QUESTION)

I understand that the complete code must be published; no doubt about it! The
Lucene-Mahout project is very close to my research thesis, so I am aiming for
a possible publication on some hybrid learning algorithms. Please correct me
if I am wrong: my understanding is that the implemented algorithm is entirely
the property of Apache, and I would be very happy to contribute to the
community. That being said, are the publications related to the hybrid machine
learning algorithms still the property of the university? I am not talking
about the code here, only about the publication. The reason for my question is
that I am new to the open-source world as well as to the publication world:
it's very exciting! I only wanted to clarify everything before, very
hopefully, starting.

Farid:
> 6) I assume that we will be using Lucene? Even though the learning
> algorithms can be used for different applications (images, speech
> recognition ...), I am more interested in text algorithms, especially since
> Lucene offers stemming, stop-word filtering, text normalization and
> even synonym expansion functionality.

Isabel Drost:
I think it should be fine to use Lucene for the preprocessing steps and for
feature extraction. It would be nice if the algorithm were designed and
implemented generally enough to allow others to use it for processing images,
speech or whatever they like - if that is possible and makes sense for your
algorithm.
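
One way to picture this separation is the sketch below: text-specific
preprocessing produces plain feature vectors, and the learning-side code only
ever sees vectors. This is a simplified stand-in, not Lucene's API. The tiny
stop-word list, the whitespace tokenizer, and the names `text_to_vector` and
`cosine_similarity` are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real analyzer has far more.
STOP_WORDS = {"the", "a", "of", "and", "is"}


def text_to_vector(text, vocabulary):
    """Text-specific step: turn a document into a term-frequency vector."""
    tokens = [t for t in text.lower().split() if t not in STOP_WORDS]
    counts = Counter(tokens)
    # One component per vocabulary term, in vocabulary order.
    return [float(counts[term]) for term in vocabulary]


def cosine_similarity(u, v):
    """Generic step: works on any two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


vocabulary = ["machine", "learning", "lucene"]
doc_a = text_to_vector("the machine learning", vocabulary)
doc_b = text_to_vector("Lucene is a library", vocabulary)
text_score = cosine_similarity(doc_a, doc_b)

# The same function accepts vectors from any other source, e.g. pixel values.
image_score = cosine_similarity([0.2, 0.4], [0.4, 0.8])
```

Because `cosine_similarity` never touches text, the preprocessing step could be
swapped for an image or speech feature extractor without changing it.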

Farid (NEW QUESTION)
That's not an issue; all the algorithms usually use the VSM (vector space
model). I have already implemented some learning algorithms in the past in
such a way that the machine learning algorithm could be applied to any type of
data (image, speech...). However, I only wanted to know whether the use of
Lucene is required, suggested, or neither.

Regards,
Farid
