Hi Vaijanath,

Thanks so much for your help. It makes things much clearer to me. I will
study the material at your link.
Yes, I have been using Yahoo Search BOSS, and we can extract the text with
Apache Tika. But I think it will take a lot of time to tag the hotel names
manually, as the training data needs at least 15000 sentences.
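
To cut down the manual tagging, here is a minimal sketch of the
title-in-description idea from the quoted messages below. It assumes each
DBPedia hotel article gives us a title and a description, and it writes
lines in the OpenNLP name finder training format (one sentence per line,
whitespace-separated tokens, names wrapped in <START:type> ... <END>). The
function names, the "hotel" type label, and the crude tokenizer and
sentence splitter are only my assumptions for illustration, not part of
OpenNLP or DBPedia.

```python
import re


def tokenize(text):
    # Crude tokenizer (an assumption): words, optionally joined by
    # apostrophes or hyphens, or single punctuation marks.
    return re.findall(r"\w+(?:['-]\w+)*|[^\w\s]", text)


def tag_sentence(sentence, name, entity_type="hotel"):
    """Return the sentence in name finder format, or None if the
    name does not occur in it."""
    tokens = tokenize(sentence)
    name_tokens = tokenize(name)
    n = len(name_tokens)
    out = []
    found = False
    i = 0
    while i < len(tokens):
        if n and tokens[i:i + n] == name_tokens:
            # Wrap the matched token span in START/END markers.
            out.append(f"<START:{entity_type}>")
            out.extend(name_tokens)
            out.append("<END>")
            i += n
            found = True
        else:
            out.append(tokens[i])
            i += 1
    return " ".join(out) if found else None


def training_lines(title, description, entity_type="hotel"):
    """Tag every sentence of the description that contains the title."""
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", description.strip())
    lines = []
    for sentence in sentences:
        tagged = tag_sentence(sentence, title, entity_type)
        if tagged is not None:
            lines.append(tagged)
    return lines


if __name__ == "__main__":
    for line in training_lines(
        "Hotel Chelsea",
        "The Hotel Chelsea is a hotel in Manhattan. It was built in 1884.",
    ):
        print(line)
```

Sentences that do not contain the exact title are simply skipped, so the
output would still need a manual check for missed or partial mentions.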

Thanks so much,
Nguyen Van Tri.


On Wed, Jun 15, 2011 at 4:14 PM, Rao, Vaijanath
<[email protected]> wrote:

> Hi Tri,
>
> The link to download DBPedia data is http://wiki.dbpedia.org/Downloads36 .
> There might be some issues with the DBPedia servers, but I think they will
> be sorted out and we should then be able to download the data (this might
> take a day or two to get corrected).
>
> Once I can download it, I should be able to help you more with turning the
> DBPedia data into a training set for you.
>
> Using the Google search engine might be a good option, but I have never
> tried it myself, as it would involve HTML parsing and other work. However,
> I have in the past used Yahoo's WebSearch API (
> http://developer.yahoo.com/search/web/V1/webSearch.html ), which returns
> some description of the query term and does not require writing an HTML
> parser.
>
> Let me know if any of the above helps you.
>
> --Thanks and Regards
> Vaijanath N. Rao
>
> -----Original Message-----
> From: Tri Nguyen [mailto:[email protected]]
> Sent: Wednesday, June 15, 2011 2:16 PM
> To: [email protected]
> Subject: Re: Hotel Name model
>
> Hi Vaijanath,
>
> It means that the title is the name of a hotel, and you look for the
> sentences containing that name to use as training data lines, am I correct?
> Can we get the URLs of the articles in DBPedia? I am sorry to ask so many
> questions, but I don't know much about DBPedia.
> Since we cannot download data from DBPedia, can we choose the hotel names
> and query Google to collect the top pages as data sets? But I think this
> approach would not have high precision.
>
> Thanks for your explanation,
> Nguyen Van Tri.
>
> On Wed, Jun 15, 2011 at 3:21 PM, Rao, Vaijanath
> <[email protected]> wrote:
>
> > Hi Tri,
> >
> > The DBPedia link says that it identifies hotels. If we parse the
> > DBPedia data and keep only those elements which have Hotel as their
> > class (or parent class), we can then mark that data for training. Each
> > article in DBPedia has a title and a description, so in the worst
> > case we can look for the title in the description and mark that entity
> > name for training.
> >
> > For some reason DBPedia is not allowing me to download the data. But once
> > I get it downloaded, I will be able to code the wrapper from DBPedia to
> > OpenNLP in a couple of days' time.
> >
> > --Thanks and Regards
> > Vaijanath N. Rao
> >
> > -----Original Message-----
> > From: Tri Nguyen [mailto:[email protected]]
> > Sent: Wednesday, June 15, 2011 12:57 PM
> > To: [email protected]
> > Subject: Re: Hotel Name model
> >
> > Hi Vaijanath,
> >
> > Thanks so much for your reply. At first I thought I could make a Hotel
> > model like the Job Title model described in chapter 6 of the
> > book Introduction to Linguistic Annotation and Text Analytics. But it
> > is difficult for me to choose the right corpus to build the training data.
> > Because Hotel is a subclass of the Organization class (
> > http://cs.nyu.edu/cs/faculty/grishman/NEtask20.book_8.html#HEADING26),
> > I think I could take the corpus of the Organization model and remove the
> > non-hotel training data to get training data for a Hotel model. But I
> > don't know which corpus was used to build the Organization model. Could
> > you tell me which one it is?
> >
> > Could you please explain your link in more detail? Do you mean that we
> > can collect hotel names and build training data from them? At
> > http://rtw.ml.cmu.edu/rtw/kbbrowser/pred:hotel I see a large list of
> > hotel names. Would it help us build the training data?
> >
> > Thanks so much for your patience in reading this long question,
> >
> > Nguyen Van Tri.
> >
> >
> > On Wed, Jun 15, 2011 at 12:35 PM, Rao, Vaijanath
> > <[email protected]> wrote:
> >
> > > Hi Tri,
> > >
> > > You can try a model similar to Organization, and you would need some
> > > training data for Hotel. You can start by looking at DBPedia data as
> > > initial sample data.
> > >
> > > http://mappings.dbpedia.org/index.php/OntologyClass:Hotel ( This is
> > > the Hotel ontology ). If there is larger interest, I can work on
> > > contributing DBPedia data as a training set for a particular type.
> > >
> > >
> > > --Thanks and Regards
> > > Vaijanath N. Rao
> > >
> > > ________________________________________
> > > From: Tri Nguyen [[email protected]]
> > > Sent: Wednesday, June 15, 2011 08:33
> > > To: [email protected]
> > > Subject: Hotel Name model
> > >
> > > Hi,
> > >
> > > Could somebody guide me on how to build a Hotel Name model?
> > >
> > > Thanks,
> > > Tri.
> > >
> >
>
