Thanks for your reply!
> > Background: I graduated from IIT Kharagpur in 2009 and have been involved in
> > research in IR/NLP and Machine Learning for nearly 2 years.
>
> Would it be possible to provide links to any papers/presentations or
> code that you have published?

Yes, definitely. Unfortunately, my old website at the organisation I worked at is down; it contained more details of the projects. However, here are the links to the papers:

https://dl.acm.org/citation.cfm?id=2010069&dl=ACM&coll=DL&CFID=316248085&CFTOKEN=31366376
http://www.aclweb.org/anthology/D11-1073
http://www.icwsm.org/2013/program/accepted-papers/ (a recent one, 'Detecting Comments on News Articles in Microblogs')

> > 1. I was wondering whether I could have a look at or have some indication to
> > the quality of files available. This will give me some idea about the kinds
> > of error
>
> The project idea requires the interested candidate to propose within
> the scope of the project the kind of errors the initial
> iteration/release will handle.

I would be happy to propose some methods to tackle errors. I was wondering whether I could have a look at the digitized text corpora themselves. For example, I know there can be wrongly recognized characters, spelling mistakes, and the like. However, I thought I would get a better idea of other kinds of errors if I saw some of the documents for which such a search would be built. Do you think this is possible?

> > 3. Does the IR system have to be implemented on top of Lucene (or other open
> > source software) or can be completely stand alone.
>
> I was hoping that we would be able to utilize ElasticSearch or
> similar. Lucene is an option too.

I will look at ElasticSearch.

Thanks again!
Alok
_______________________________________________
Project-ideas mailing list
[email protected]
http://lists.ankur.org.in/listinfo.cgi/project-ideas-ankur.org.in
