Have you considered Apache Solr/Lucene? This may be a standard text 
classification problem for that technology.
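
For what it is worth, the bag-of-words approach linked in the quoted message 
can be sketched in a few lines. This is only a rough illustration, not Solr or 
Lucene code: it assumes the URLs have already been extracted from the RDF 
files, and the function names, the example.org URLs, and the category labels 
are all made up for the example. It uses a small Naive Bayes classifier with 
add-one smoothing, which is one common choice among many.

```python
import math
import re
from collections import Counter, defaultdict

# Tokens that carry no category information (an assumption for this sketch).
STOP_TOKENS = {"http", "https", "www"}


def url_tokens(url):
    """Split a URL into lowercase alphabetic tokens (its bag of words)."""
    parts = re.split(r"[^a-z]+", url.lower())
    return [t for t in parts if t and t not in STOP_TOKENS]


def train(labelled_urls):
    """Count tokens per category from a list of (url, category) pairs."""
    token_counts = defaultdict(Counter)
    doc_counts = Counter()
    for url, category in labelled_urls:
        token_counts[category].update(url_tokens(url))
        doc_counts[category] += 1
    return token_counts, doc_counts


def classify(url, token_counts, doc_counts):
    """Return the most likely category under multinomial Naive Bayes."""
    vocab = {t for counts in token_counts.values() for t in counts}
    total_docs = sum(doc_counts.values())
    best, best_score = None, -math.inf
    for category, counts in token_counts.items():
        # Log prior plus log likelihood with add-one (Laplace) smoothing.
        score = math.log(doc_counts[category] / total_docs)
        denom = sum(counts.values()) + len(vocab)
        for tok in url_tokens(url):
            score += math.log((counts[tok] + 1) / denom)
        if score > best_score:
            best, best_score = category, score
    return best


# Hypothetical training data; real labels would come from the dataset.
TRAINING = [
    ("http://example.org/sports/football-scores", "sports"),
    ("http://example.org/sports/tennis-news", "sports"),
    ("http://example.org/tech/python-tutorial", "tech"),
    ("http://example.org/tech/java-news", "tech"),
]

model = train(TRAINING)
print(classify("http://example.org/football-results", *model))
```

Reading the URLs out of the RDF files is a separate step that this sketch 
leaves out; an RDF library would normally handle that part.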

> On 28 Dec 2015, at 6:04 PM, Jonathan Camilleri <[email protected]> 
> wrote:
> 
> I am trying to write a program that parses RDF files and builds a machine 
> learning model, e.g. one that classifies the URLs read from those files into 
> categories.
> 
> The examples I have found so far were a bit limiting, so I am asking whether 
> there is any project worth mimicking. I have done some experiments in 
> Eclipse, but they are not very complete so far. I am now stuck trying to 
> understand what syntax to use to read particular parts of an RDF file.
> 
> I have read the tutorials at W3C as well; they appear to provide information 
> on the file formats.
> 
> Further reading
> 1. https://en.wikipedia.org/wiki/Bag-of-words_model
> 2. http://nlp.stanford.edu/software/CRF-NER.shtml
> 
> See attachments.
> 
> -- 
> Jonathan Camilleri
> 
> <ics_5111_dataset.zip>
> <assignment-reading the udf.docx>